18 Replies Latest reply: May 15, 2014 10:04 AM by ServerGuy _

    Monitoring stops without an event

    joey

      We had an issue where the Data Pump monitoring service seems to have stopped.

       

      When I looked at the Server Status page, everything seemed to be running as normal.  However, several monitored files were sitting in the directory unprocessed.  I checked the Event Viewer (both the Data Pump log and the system log), and there were no errors since the last job had processed.

       

      Shutting down Data Pump and starting it back up (via the Server Status page) caused all of the monitored files to process instantly.

       

      I can't find any error that indicates monitoring stopped, but from the behavior it did stop.  This has repeated itself sporadically, perhaps once every few months.  Has anyone else experienced a similar situation?

       

      Are there any logs other than the Event Viewer or diagnostics that may be helpful in determining why monitoring has stopped and needs to be restarted?

        • Monitoring stops without an event
          Chickenman _

          For the past year or so we have experienced sporadic failure of DP to kick off monitored files - but only some files; other regularly scheduled submissions process normally.

           

          The forum suggested changing the default printer - for whatever reason - and that has worked in some cases but not many. In the most extreme case, the project was deleted and reconstructed exactly as it had been, and it then worked fine - go figure.

           

          The logs are cryptic and do not lead to a root cause. We finally reported this issue to Tech Service about a week ago but as yet have received no reply. Will update when more info is available.

           

          CM

          • Monitoring stops without an event
            pranitj84 _

            I experienced similar issues last night where everything seemed to be running as normal, but there were several monitored files that were not processed.

             

            I did experience another problem.  One of the processes did not run even after rebooting the Data Pump, and after I ran that process manually, another five processes ran.  Is there any way we could tell which processes are being held up?

             

            Also, the process that was being held up gave the following error message when I checked the Monarch Data Pump events in the Event Viewer:

             

            DwchServer.ReloadMonitoringTablesException: Error occurred while loading the monitoring tables. ---> DwchServer.InvalidFindFirstChangeNotificationException: Error setting up initial change notification for
            corporate\finance\PROD\URS\DOWNLOAD - error 1

               at DwchServer.ChangeNotification..ctor(String strPath, Int32 iWatchSubtree, FileNotifyChange notificationType)

               at DwchServer.w.b()

               at DwchServer.d.d()

               --- End of inner exception stack trace ---

               at DwchServer.d.d()

               at DwchServer.d.e()

             

            I would appreciate any feedback on the problem.

             

            Thanks,

            Pranit

              • Monitoring stops without an event
                joey

                Pranit,

                 

                We experienced a situation similar to yours, and have been able to resolve it.  What version of Data Pump are you using?

                 

                In your case, you are receiving an error in the event log, which you posted.  In our case, we are receiving nothing in the event log to indicate what is wrong.

                 

                 

                Chickenman, strange that changing the printer would change anything but I guess it did.  In our case, it is not a few specific processes that are failing, but all monitored processes.

                • Monitoring stops without an event
                  Gareth Horton

                  Hi Pranit,

                   

                  That is quite an unusual error. Could you give me a few more details, please?

                   

                  What OS and network environment are involved in this, both on the MDP machine side and for the network resources you are monitoring? Of particular interest is the drive that you are monitoring.

                   

                  The error is from the Windows API that deals with file events and is saying

                   

                  "If the network redirector or the target file system does not support this operation, the function fails"

                   

                  This means that the filesystem on this drive/folder cannot be fully supported by the Win32 FindFirstChangeNotification API.  As you can imagine, this is quite unusual and we have not come across it before.

                   

                  I don't know if this can be caused by network connectivity issues or other issues related to the network stack.  Has the folder you are trying to monitor ever worked correctly, or is this a first-time or intermittent issue?

                   

                  Regarding your other point, there is no simple way to tell which processes are held up by the "deadlock", as it depends on which of the monitoring threads they are assigned to, which in turn depends on a number of factors.

                   

                  If there are issues with network connectivity and certain folders cause problems, then it is better to move the monitoring to folders local to the Data Pump installation, as Joey has done.

                   

                  There is also another method of changing the monitoring behavior, listed here: http://www.monarchforums.com/showpost.php?p=9826&postcount=6

                   

                  One strategy to increase the "robustness" might be to increase the NumberOfMonitorThreads value and decrease the NumberOfFileSpecsPerMonitorThread value, meaning that if a thread is killed by an error such as the one above, it will affect fewer monitored files.

                   

                  Please note the caveats listed in the post.

                   

                  Gareth
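
The trade-off between the number of monitor threads and the number of file specifications per thread can be illustrated with a rough Python sketch. It rests on a simplified model that is an assumption, not documented Data Pump behavior: total capacity is taken to be threads multiplied by file specs per thread, and a thread that dies is assumed to take only its own file specs with it. The function names are hypothetical.

```python
def monitor_capacity(threads: int, specs_per_thread: int) -> int:
    """Total file specs that fit, assuming capacity = threads * specs per thread."""
    return threads * specs_per_thread


def worst_case_specs_lost(total_specs: int, threads: int, specs_per_thread: int) -> int:
    """File specs that stop being watched if one fully loaded thread dies.

    Assumes (hypothetically) that each file spec belongs to exactly one thread.
    """
    if total_specs > monitor_capacity(threads, specs_per_thread):
        raise ValueError("more file specs than the configured threads can hold")
    return min(total_specs, specs_per_thread)


# Same capacity (500 file specs), very different exposure to one failed thread.
few_large_threads = worst_case_specs_lost(200, threads=10, specs_per_thread=50)
many_small_threads = worst_case_specs_lost(200, threads=50, specs_per_thread=10)
print(few_large_threads, many_small_threads)
```

Under this model, spreading the same capacity over more, smaller threads limits how many monitored files a single failed thread can silence. The real assignment of file specs to threads depends on factors not described in this thread.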

                   


                • Monitoring stops without an event
                  Gareth Horton

                  Joey,

                   

                  Did the service actually stop, or was it still running, but the monitoring was not working?

                   

                  Gareth

                   


                    • Monitoring stops without an event
                      joey

                      Gareth,

                       

                      We had this issue again last night.  When I looked at the Services snap-in, I saw that Monarch Data Pump 10 was started.  When I viewed the Server Status page, it looked to be normal.  However, there were over a dozen monitored files waiting to run.  I clicked stop service and start service and that did not kick off monitoring.

                       

                      I then did shutdown and startup pump, and that caused all of the jobs to start.

                       

                      Hope that answers your questions. We're looking for a way to keep tabs on the monitoring.  For now, we're going to have someone periodically check the input directory.
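
The periodic check of the input directory could be automated with a small script run on a schedule. The Python sketch below lists files that have been sitting in a directory longer than a threshold. The function name and the threshold are hypothetical, and it assumes that Data Pump moves or deletes input files once they are processed; if it does not, a different signal would be needed.

```python
import time
from pathlib import Path


def find_stale_files(input_dir, max_age_seconds, now=None):
    """Return files in input_dir last modified more than max_age_seconds ago."""
    now = time.time() if now is None else now
    stale = []
    for entry in Path(input_dir).iterdir():
        # Only plain files are considered; subdirectories are skipped.
        if entry.is_file() and now - entry.stat().st_mtime > max_age_seconds:
            stale.append(entry)
    return sorted(stale)
```

For example, `find_stale_files(input_dir, 15 * 60)` returns whatever has waited more than 15 minutes in `input_dir`. A non-empty result would be the cue to alert someone or to shut down and start up Data Pump.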

                        • Monitoring stops without an event
                          Chickenman _

                          Joey,

                           

                          We monitor the input folder daily, but after moving to V10 of the Data Pump a month or so ago and deleting/re-creating processes of the chronic offenders we have not experienced a failure to monitor.

                           

                          CM

                            • Monitoring stops without an event
                              joey

                              There isn't a chronic offender in our case.  All jobs stop being monitored.

                               

                              If there's one job causing it, it's anyone's guess which one it is, as there is no error message in the event log.

                                • Monitoring stops without an event
                                  Jeff C

                                  We are having the same issue.  Just wondering if any solution has been found?

                                    • Monitoring stops without an event
                                      Tim Racht

                                      There is a reply from 3/2/2010 in reference to setting some registry entries.  Below is a copy of that reply.  Please let us know how you make out.

                                       

                                       

                                      BIG time Thank you Gareth!!  & this thread for resolving Event Viewer Error 210 (Too many file specifications for a monitored process) after upgrade from V.9 to 10.5. Fought this battle for about a week and a half before resorting to the forums for desperate help.

                                       

                                      I used Gareth's link http://www.monarchforums.com/showpos...26&postcount=6 to resolve the problem. Had to create these DWORD values because they DO NOT EXIST by just installing Datapump!

                                       

                                      "There are two DWORD registry settings in HKEY_LOCAL_MACHINE\SOFTWARE\Datawatch\DWCH Server that tell the Monitor how many threads to create, NumberOfMonitorThreads and how many file specifications to monitor within each thread, NumberOfFileSpecsPerMonitorThread."

                                       

                                      At first I made the value 10, as he said was the default. Recycled the registry & Datapump service, and the 210 errors still popped up. I then made it 200, since I was not sure how many monitored processes the users had. I believe we have nearly 210 (in quantity) monitored processes. I thought there might be a direct correlation, but through much trial and error I found the threshold at which the 210 errors disappear: 27 seems to be the number for an error-free Event Viewer after stopping and starting Data Pump for us. I currently have it set to 50 for some room to grow.

                                       

                                      I have NumberOfFileSpecsPerMonitorThread set to 10 for now.

                                       

                                      Why the installation doesn't install these registry keys is beyond me. 

                                       

                                      I even went to the work of moving this installation to a new server. It is currently running on VMWare Windows 2003. 4Gb RAM.

                                        • Monitoring stops without an event
                                          Tim Racht

                                          Gareth, I am not able to get to the link indicated above.  Is this link still available, or was it broken with the new forum?

                                            • Monitoring stops without an event
                                              Data Kruncher

                                              Hi Tim,

                                               

                                              I think that the link should point to this thread (http://www.monarchforums.com/showthread.php?2418-Start-Process-from-Remote-Machine&p=9826#post9826). I found it using my custom Google search for the Monarch forum (http://www.google.com/cse/home?cx=017979669301925061066:c4v8ksw-llg). Is that what you're looking for?

                                                • Monitoring stops without an event
                                                  Tim Racht

                                                  Data Kruncher, I believe this is what we were looking for, as our IT staff was able to adjust the registry items once again.  Our problem is that we have a virtual machine, and apparently when they upgrade the machine the registry settings are wiped out, and then our processes start to suspend for no apparent reason.  When the business unit reviews we have to start all over and then discuss with our IT.  Thanks for your help.

                                                    • Monitoring stops without an event
                                                      ServerGuy _

                                                      Hello, I am the one who wrote the post about creating the 2 registry values that Tim Racht copied above.  I can't seem to find my old post, which I created in Feb. 2010.  But if you are having problems with monitored processes not working, this worked for me.  Keep in mind this was after we experienced problems with monitoring not working after upgrading to 10.5, so if you are on an earlier version I can't be certain this applies.

                                                       

                                                      In a nutshell:

                                                      I created 2 DWORD values under this key:

                                                      Regedit to My Computer|HKEY_LOCAL_MACHINE|SOFTWARE|Datawatch|DWCH Server

                                                       

                                                              NumberOfFileSpecsPerMonitorThread  10 (decimal)

                                                              NumberOfMonitorThreads             50 (decimal)

                                                       

                                                      The NumberOfMonitorThreads value is the one that resolved the issue for me.  You may have to test with different values depending on how many monitored processes you have.  It isn't a 1:1 ratio with the number of monitored processes.  A value of around 27 was the minimum threshold to resolve the Event 210 issues when the department had around 200 monitored processes.  I made it 50 to provide some growing room.
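
Since the values can be lost when the machine is rebuilt or upgraded, it may help to keep them in a .reg file that can be re-imported. The Python sketch below generates such a file's text for the key and the two values listed above. The function name is hypothetical, the numbers are simply the ones from this post, and the right values for another installation would need the same trial and error.

```python
KEY_PATH = r"HKEY_LOCAL_MACHINE\SOFTWARE\Datawatch\DWCH Server"

# Values as listed in this post (decimal); adjust for your own installation.
SETTINGS = {
    "NumberOfFileSpecsPerMonitorThread": 10,
    "NumberOfMonitorThreads": 50,
}


def render_reg_file(key_path, dword_values):
    """Build the text of a .reg file that sets DWORD values under key_path."""
    lines = ["Windows Registry Editor Version 5.00", "", f"[{key_path}]"]
    for name, value in dword_values.items():
        if not 0 <= value <= 0xFFFFFFFF:
            raise ValueError(f"{name} does not fit in a DWORD: {value}")
        # .reg files store DWORDs as eight hexadecimal digits.
        lines.append(f'"{name}"=dword:{value:08x}')
    return "\r\n".join(lines) + "\r\n"


print(render_reg_file(KEY_PATH, SETTINGS))
```

The printed text can be saved with a .reg extension and imported with Regedit by an administrator. As described above, Data Pump needs to be stopped and started afterwards for the values to take effect.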