FRS Cheat


This topic contains 0 replies, has 1 voice, and was last updated by  shanto_thomas 1 year ago.

Viewing 1 post (of 1 total)
  • Author
    Posts
  • #2928

    shanto_thomas
    Participant

    Basic FRS scenarios

    • We are seeing FRS events (for example 13568, 13508, 13509, 13565). For the event IDs logged by the source NtFrs, refer to:
      http://support.microsoft.com/kb/308406
    • Group Policy is not working because of morphed folders on the domain controllers.
    • NETLOGON and SYSVOL are not shared on the domain controllers, and the Policies and Scripts folders are missing under %systemroot%\sysvol.
    • NETLOGON and SYSVOL are not shared even though the Policies and Scripts folders are present (the junction points might be incorrect).
    • FRS attributes in Active Directory are missing or incorrect (for example frs-memberreference, frs-rootpath, frs-stagingpath).

       

      Troubleshooting

       1) Event ID 13568 (JRNL_WRAP_ERROR) in the event viewer

      • Back up the Policies and Scripts folders under %systemroot%\sysvol on the domain controller that logs event 13568.
      • Open an elevated command prompt and run "net stop ntfrs".
      • Open the registry editor and browse to HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Services\NtFrs\Parameters\Replica Sets. Under it there is one GUID subkey per replica set (SYSVOL and any DFS replica sets). Select each GUID and check the value "Replica Set Name" in the right pane until you find the one that shows DOMAIN SYSTEM VOLUME (SYSVOL SHARE).
      • Expand HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Services\NtFrs\Parameters\Cumulative Replica Sets and select the same GUID that you found under Replica Sets. In the right pane, right-click the BurFlags value, change it to D2 (hexadecimal), and close the registry editor.
      • From the elevated command prompt, run "net start ntfrs".
      • Wait for event ID 13516 in the event viewer.

        Please refer to the article: http://support.microsoft.com/default.aspx?scid=kb;en-us;290762&sd=tech
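The GUID lookup in the registry steps above can be sketched as a small helper. This is only an illustration: the GUIDs in the usage note are made up, the mapping is assumed to have been read by hand from regedit, and nothing here touches the registry.

```python
# Registry locations as described in the steps above.
CUMULATIVE_REPLICA_SETS = (
    r"HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Services"
    r"\NtFrs\Parameters\Cumulative Replica Sets"
)
SYSVOL_REPLICA_NAME = "DOMAIN SYSTEM VOLUME (SYSVOL SHARE)"
NON_AUTHORITATIVE_RESTORE = 0xD2  # the BurFlags value used in this procedure


def find_sysvol_guid(replica_set_names):
    """Return the GUID whose replica set name is the SYSVOL share.

    replica_set_names is a hypothetical dict {guid: replica set name},
    filled in from what regedit shows under "Replica Sets".
    """
    for guid, name in replica_set_names.items():
        if name.strip().upper() == SYSVOL_REPLICA_NAME:
            return guid
    return None


def burflags_key(guid):
    """Build the key path under which BurFlags has to be edited."""
    return CUMULATIVE_REPLICA_SETS + "\\" + guid
```

For example, with two made-up GUIDs, `find_sysvol_guid({"aaaa-1111": "DFSROOT|SHARE", "bbbb-2222": "DOMAIN SYSTEM VOLUME (SYSVOL SHARE)"})` returns `"bbbb-2222"`, and `burflags_key("bbbb-2222")` gives the key to open in regedit.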

          Possible causes of JRNL_WRAP_ERROR

          • Many files are added at once to a replica tree while FRS is busy, starting up, or not running.
          • On a server that is being used for authoritative restore, or as the primary server for a new replica partner, excessive file activity at the start of this process can consume NTFS USN journal records. Sizing the NTFS USN journal at 128 MB per 100,000 files managed by FRS helps avoid this condition.
          • NTFS needed to be processed with Chkdsk, and Chkdsk corrected the NTFS structure. In this case, NTFS creates a new NTFS USN journal for the volume or deletes the corrupt entries from the end of the journal.
          • The NTFS USN journal is deleted or reduced in size.
          • FRS is in an error state that prevents it from processing changes in the NTFS USN journal.
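The "128 MB per 100,000 files" rule of thumb mentioned in the list above is simple arithmetic. A minimal sketch follows; rounding up to whole 100,000-file blocks and using one block as the minimum are assumptions made here, not part of the rule itself.

```python
import math

MB_PER_BLOCK = 128          # from the rule of thumb
FILES_PER_BLOCK = 100_000   # from the rule of thumb


def size_mb_for_files(file_count):
    """Size in MB suggested by the 128 MB per 100,000 files rule.

    Assumption: partial blocks are rounded up, and at least one block
    is always returned.
    """
    if file_count < 0:
        raise ValueError("file_count must not be negative")
    blocks = max(1, math.ceil(file_count / FILES_PER_BLOCK))
    return blocks * MB_PER_BLOCK
```

With this rounding, 250,000 files managed by FRS work out to three blocks, that is 384 MB.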

                    What is JRNL_WRAP_ERROR?

                    NTFS maintains a special log called the NTFS USN journal, which is a high-level description of all the changes to files and directories on an NTFS volume. FRS uses this mechanism in order to track changes to NTFS directories of interest, and to queue those changes for replication to other computers. The NTFS USN journal has defined size limits and will discard old log information on a first-in, first-out basis in order to maintain its correct size.

                    If FRS processing falls behind the NTFS USN journal, and if NTFS USN journal information that FRS needed has been discarded, then FRS enters a journal wrap condition. FRS then needs to rebuild its current replication state with respect to NTFS and other replication partners.

                    2) Event id 13508 in the event viewer for the other domain controller

                     

                    Event Type: Warning

                    Rule: Warning alert, states the need to look for event ID 13509 to see if everything is working as expected.

                    Message Text:

                    The File Replication Service is having trouble enabling replication from %1 to %2 for %3 using the DNS name %4. FRS will keep retrying.

                    Following are some of the reasons you would see this warning.

                    [1] FRS can not correctly resolve the DNS name %4 from this computer.

                    [2] FRS is not running on %4.

                    [3] The topology information in the Active Directory for this replica has not yet replicated to all the Domain Controllers.

                    Things to check

                     

                    • Check name resolution between the domain controllers. We should be able to ping the other domain controller by both name and IP address, and nslookup should resolve the names of the other domain controllers.
                    • Check that the FRS service is started on both domain controllers and that its startup type is Automatic.
                    • Check that AD replication between the domain controllers is working.
                    • Check the ports between the domain controllers using the PortQueryUI tool.
                    • Check the firewall service. It should be stopped and disabled on Windows 2003 servers, and enabled on Windows 2008 servers.
                    • Check that the service is not bound to a static port on any of the domain controllers. This is set under the registry location HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Services\NtFrs\Parameters (value: RPC TCP/IP Port Assignment).
                    • Check the provider order and the binding order on both domain controllers.
                    • Try a clean boot on the domain controller that logs event ID 13508, to rule out third-party software such as antivirus (for example Symantec or McAfee).
                    • Check whether the value Replica Set Tombstoned under HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Services\NtFrs\Parameters\Replica Sets\ is set to 1 on the DCs.
                    • Check for stale entries for old domain controllers still present in the domain.
                    • Collect FRSDiag output from all the servers and have it analyzed.
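The name-resolution and port checks in the list above can also be scripted. The sketch below uses only the Python standard library and is not a replacement for nslookup or PortQueryUI. Port 135 is the RPC endpoint mapper; FRS also uses a dynamically assigned RPC port unless a static one is configured, so an open port 135 alone does not prove FRS connectivity.

```python
import socket

RPC_ENDPOINT_MAPPER_PORT = 135  # RPC endpoint mapper


def can_resolve(name):
    """True if this machine can resolve the given host name or address."""
    try:
        socket.getaddrinfo(name, None)
        return True
    except socket.gaierror:
        return False


def port_open(host, port, timeout=3.0):
    """True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False
```

On a domain controller one might call `can_resolve("dc2.example.com")` and `port_open("dc2.example.com", RPC_ENDPOINT_MAPPER_PORT)`, where `dc2.example.com` is a placeholder for the partner named in event 13508.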

                      3) Morphed folders under %SYSTEMROOT%\SYSVOL\domain

                      • If there are morphed folders under %SYSTEMROOT%\SYSVOL\domain, we can delete one of each duplicate pair; as long as replication is working, the deletion replicates to the other domain controllers.

                        NOTE: Before deleting anything, confirm which folder holds the up-to-date data. The folder named FolderName_NTFRS_<guid> might be the one with the current policies and scripts, so be certain which copy is current before deleting the other.
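To spot morphed folders quickly, one can list the directories whose names carry the _NTFRS_ marker. This is a minimal sketch that only reports candidates; it does not decide which copy is current and deletes nothing. The path in the usage note is an assumption.

```python
import os

MORPH_MARKER = "_NTFRS_"  # marker FRS puts into the name of a morphed folder


def find_morphed_folders(path):
    """Return the sorted names of sub-directories that look morphed."""
    return sorted(
        entry
        for entry in os.listdir(path)
        if MORPH_MARKER in entry.upper()
        and os.path.isdir(os.path.join(path, entry))
    )
```

For example, `find_morphed_folders(r"C:\Windows\SYSVOL\domain")` on a DC where SYSVOL lives in the default location.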

                        Why morphing occurs

                         

                        • A folder is created on multiple machines in the replica set before the folder has been able to replicate. The administrator or a program may create duplicate folders on multiple FRS members. This can happen, for example, if the administrator tries to make data consistent among all members with a manual copy.
                        • You initiate an authoritative restore (D4) on one server and:
                            • You did not stop the service on all other members of the reinitialized replica set before the NTFRS service restarts after the authoritative restore.
                            • You did not set the D2 registry key on all other members of the reinitialized replica set before such a server replicated outbound changes to reinitialized members of the replica set.

                          References:

                          http://support.microsoft.com/?id=328492

                          http://support.microsoft.com/kb/2000056

                          KB 2108772

                          4) Policies and Scripts folders missing

                          a) Create the Policies and Scripts folders under %systemroot%\sysvol\domain, then do a non-authoritative restore by changing the BurFlags value to D2 under HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Services\NtFrs\Parameters\Cumulative Replica Sets\ as discussed earlier.

                          5) NETLOGON and SYSVOL not shared even if the Policies and Scripts folders are present

                          a) On Windows 2003 we can check the junction points, delete them, and recreate them with the following commands.

                          Check the junction points

                          linkd "%systemroot%\sysvol\sysvol\dmz.gisnet.net"

                          linkd "%systemroot%\sysvol\Staging Areas\dmz.gisnet.net"

                          Delete the junction points

                          linkd "%systemroot%\sysvol\sysvol\dmz.gisnet.net" /d

                          linkd "%systemroot%\sysvol\Staging Areas\dmz.gisnet.net" /d

                          Recreate the junction points

                          linkd "%systemroot%\sysvol\sysvol\dmz.gisnet.net" "%systemroot%\sysvol\domain"

                          linkd "%systemroot%\sysvol\Staging Areas\dmz.gisnet.net" "%systemroot%\sysvol\staging\domain"

                          In Windows 2008 we can use mklink (with the /J switch) to create the junction points.

                          Reference: http://technet.microsoft.com/en-us/library/cc753194(WS.10).aspx

                          b) We can check the values of the FRS attributes in adsiedit.msc.

                          The objects and attributes to check are:

                          DN Path                                             ObjectClass

                          DC=A,DC=COM                                         Root Domain NC
                           OU=Domain Controllers                              OU Container
                             CN=                                              Computer
                                CN=NTFRS Subscriptions                        NtFrsSubscriptions
                                  CN=Domain System Volume (SYSVOL)            NtFrsSubscriber

                               CN=NTFRS Subscriptions                         NtFrsSubscriptions
                                 CN=DFSROOT                                   NtFrsSubscriber

                          Common-Name            NTFRS-Subscriber
                          System-Must-Contain    FRS-Root-Path
                          System-Must-Contain    FRS-Staging-Path
                          System-May-Contain     FRS-Member-Reference

                          DN Path                                             ObjectClass

                          DC=A,DC=COM                                         Root Domain NC
                           CN=SYSTEM                                          Container
                             CN=File Replication Service                      nTFRSSettings
                                CN=Domain System Volume (SYSVOL share)        nTFRSReplicaSet
                                  CN=DC1                                      nTFRSMember
                                  CN=DC2                                      nTFRSMember

                                CN=DFSROOT                                    nTFRSReplicaSet
                                  CN=DC1                                      nTFRSMember
                                    CN=                                       NTDS Connection
                                  CN=DC2                                      nTFRSMember
                                    CN=                                       NTDS Connection

                          Common-Name            NTFRS-Member
                          System-May-Contain     Frs-Computer-Reference
                          System-May-Contain     Server-Reference (SYSVOL only)

                          We can also use the script that recreates SYSVOL; it checks the values of all these attributes (the ones visible in adsiedit.msc) and corrects them as well. The script is available from the dspartners link, and the articles below can be used to check the values manually.

                          Reference: –

                           

                          http://support.microsoft.com/kb/296183

                          http://support.microsoft.com/kb/312862

                          How to use FRSDiag

                           

                          It’s very important to know how to read logs, especially logs like FRSDiag, DFSR Diag, etc. This e-mail can help you get the most out of FRSDiag; I will send a separate e-mail to help you understand the DFSR debug log.

                          FRSDiag runs several tests on an FRS replica member, for example inlog, outlog, repadmin /showreps, connstat, etc. You can run these separately using ntfrsutl and repadmin, but this tool was designed to make our life easier :) and it runs all the tests needed to resolve FRS-related issues.

                          The following logs are generated when you run FRSDiag.

                          • connstat.txt
                          • Inlog & outlog
                          • Ntfrs_xxx1.log
                          • Ntfrs_xxx2.log 
                          • Repadmin /showreps
                          • Ntfrs_DS.txt

                            Now let’s talk about what these logs contain and how to read them.

                            • Connstat.txt: This log is very useful when troubleshooting slow FRS replication. It shows the connection state for both inbound and outbound connections.

                                    Replica: DOMAIN SYSTEM VOLUME (SYSVOL SHARE) (5a623a91-f107-4c18-914a739758bc7985)

                                    Member: NODE1        ServiceState: 3  (ACTIVE)  OutLogSeqNum: 5054        OutlogCleanup: -1        Delta: 2

                                    Config Flags: Multimaster Primary Online

                                    Root Path   : c:\windows\sysvol\domain

                                    Staging Path: c:\windows\sysvol\staging\domain

                                    File Filter : *.tmp, *.bak, ~*

                                    Dir Filter  :

                                                                                                                             Send           Cleanup     Cos

                                    Partner         I/O   State        Rev      LastJoinTime            OLog State      Leadx  Delta   Trailx  Delta  LMT Out     Last VVJoin

                                     Node2          In

                              Now let’s talk about some important items in this output.

                              LastJoinTime: Last successful replication.

                              Leadx: This value tells us how many updates the outbound partner has received from us. For example, assume Node1 and Node2 are replication partners, with Node1 sending and Node2 receiving. If Node1’s OutLogSeqNum is 5054 and the Leadx value for Node2 is 5044, Node2 is missing 10 change orders.

                              Trailx: This value indicates the last change order that has been acknowledged. For example, if the Trailx value for Node2 is 5042 and Leadx is 5044, Node2 knows about change orders up to 5044 but has acknowledged only up to 5042.

                              State: Another important value to look at. Possible values include Joined, Joining, WaitJoin, Deleted, etc. FRS has a problem if the state is WaitJoin or Deleted.
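The Leadx and Trailx arithmetic from the examples above, written out as a tiny sketch. The function names are made up for illustration; the numbers are the ones used in the text.

```python
def change_orders_not_yet_sent(out_log_seq_num, leadx):
    """Change orders in the outbound log the partner has not received yet."""
    return out_log_seq_num - leadx


def change_orders_unacknowledged(leadx, trailx):
    """Change orders the partner knows about but has not acknowledged."""
    return leadx - trailx


# Values from the Node1/Node2 example in the text.
print(change_orders_not_yet_sent(5054, 5044))    # prints 10
print(change_orders_unacknowledged(5044, 5042))  # prints 2
```

A backlog that keeps growing between two connstat runs is a hint that the partner is falling behind.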

                              That completes our first log :).

                              • Inlog & outlog: FRS maintains an inbound log, which records change orders that originated locally or were received from partners, and an outbound log, which holds the change orders that need to be sent to other partners. These logs are useful for finding the time at which a change order was logged by the FRS engine.
                              • Ntfrs_xxx01, 02, ...: This is an interesting log that tells us exactly what the FRS engine did after a change order was committed. Look at the latest log, the one suffixed with the highest number (generally ntfrs_xxx05.txt). This log is huge, so you need to know where to start.

                                  While parsing this log we can use the Log Record Identifiers as a filter to get to the exact problematic part. The Log Record Identifiers are as follows:

                                  Identifier    Description

                                  ::            Change order trace records
                                  ++            Continuation records
                                  :DS:          DS access entries
                                  :FK:          FRS registry Key entries
                                  :H:           Log header entries
                                  :S:           Service startup and shutdown entries
                                  :SC:          Service controller entries
                                  :SR:          Send / Receive command server entries
                                  :T:           Tracking record
                                  :U:           USN journal entries
                                  :V:           Version Vector join entries
                                  :X:           Communication entries
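Filtering a large ntfrs log by Log Record Identifier can be sketched as below. Plain substring matching is an assumption that fits the sample lines in this post; the file name in the usage note is a placeholder.

```python
def records_with(lines, identifier):
    """Return the lines that carry the given identifier, e.g. ':T:' or ':DS:'."""
    return [line.rstrip("\n") for line in lines if identifier in line]


def drop_headers(lines):
    """Skip the log header entries (':H:') at the top of the log."""
    return [line.rstrip("\n") for line in lines if ":H:" not in line]
```

Usage would be along the lines of `with open("ntfrs_log.txt", errors="replace") as fh: tracking = records_with(fh, ":T:")`.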

                                   When you open the log, almost the first 1000 lines are log header entries, identified by “:H:”, so you do not need to worry about those :). Entries like “:DS:” can be interesting to look at because they tell us which DC FRS is talking to when it polls for FRS objects. A sample of “:DS:” entries follows.

                                  • :DS:

                                                           :DS: DsCs is starting.

                                                           :DS: FrsDs has started.

                                                           FrsDsFindComputer:             1684:  8796: S2: 23:02:26> :DS: Computer FQDN is cn=node1,ou=domain controllers,dc=eps,dc=com

                                                           :DS: Computer’s dns name is node1.EPS.com

                                                          :DS: Settings reference is cn=ntds settings,cn=node1,cn=servers,cn=default-first-site-name,cn=sites,cn=configuration,dc=eps,dc=com

                                    So from the above we know Node1 is talking to itself to poll AD (Node1 is a DC here). FRS might contact an orphaned DC to poll the AD configuration if the metadata of that DC was not cleaned up.

                                    • :SR: This is another interesting identifier to look at. These entries show whether an FRS change order was sent or received.
                                    • :X: These are communication entries, which tell us about the actual send and receive transactions.
                                    • :T: Tracking record entries can be a helpful way to identify and understand problems that occur during the change order process. A tracking record entry tells you which files have been changed and where the change originated.

                                              A tracking record looks as follows:

                                          :T: CoG: d42cda60 CxtG: 000001b7 [RemCo   ] Name: test_file

                                          :T: EventTime: Mon June 9, 2008 08:40:04 Ver: 0

                                          :T: FileG: ceff96a6-5c9f-433a-989c841454a1593b FID: 61a70000 0000036c

                                          :T: ParentG: 1a89f4e1-a0c0-43e4-aedbe869f767f372 Size: 00000000 00000008

                                          :T: OrigG: 2eea81b4-f92d-4941-9f269d4bbdd7ea05   Attr: 00000020

                                          :T: LocnCmd: Create State: IBCO_COMMIT_STARTED          ReplicaName: Replica-A (1)

                                          :T: CoFlags: 0000040c   [Content Locn NewFile ]

                                          :T: UsnReason: 00000002 [DatExt ]

                                          Let’s try to understand some important things here:

                                          :T: CoG: d42cda60 CxtG: 000001b7 [RemCo   ] Name: test_file

                                          CoG: Change order GUID. This is the GUID of the change order.

                                          [RemCo  ]: This indicates a remote change order that was received for the file. For a local change order it would be [LclCo].

                                          EventTime: The time when the transaction was committed.

                                          CoFlags: The change order flags (new file, file moved into a different directory, folder deleted, etc.).

                                          UsnReason: The reason for the transaction. If an existing file was modified rather than a new file created, it would be DatExt; RenNew means a file or folder was renamed, and so on.
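Pulling the fields out of the first line of a tracking record can be sketched with a regular expression. The pattern is an assumption derived only from the sample record in this post, so real logs may need adjustments.

```python
import re

# Matches lines such as:
# :T: CoG: d42cda60 CxtG: 000001b7 [RemCo   ] Name: test_file
TRACKING_HEADER = re.compile(
    r":T:\s+CoG:\s+(?P<cog>\S+)\s+CxtG:\s+(?P<cxtg>\S+)\s+"
    r"\[(?P<flags>[^\]]*)\]\s+Name:\s+(?P<name>.+)"
)


def parse_tracking_header(line):
    """Return a dict of the fields, or None if the line does not match."""
    match = TRACKING_HEADER.search(line)
    if match is None:
        return None
    flags = match.group("flags").split()
    return {
        "cog": match.group("cog"),
        "cxtg": match.group("cxtg"),
        "flags": flags,
        "name": match.group("name").strip(),
        "is_remote": "RemCo" in flags,  # remote change order
    }
```

Combined with a filter on “:T:”, this gives a quick list of which files changed and whether each change order came from a partner.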

                                          • Ntfrs_DS: This log dumps the FRS objects from AD. It can help you figure out whether an NTFRS subscriptions or subscriber object is missing.

                                            Repadmin /showreps: I hope all of us know what this log contains.
