Friday, April 30, 2010

INE Workbook Vol 1 IPv4 Multicast

Well, on to IPv4 multicast which I'm learning pretty well. This is about my fourth foray into IPv4 multicast. My troubles come with IPv6 multicast, which comes later. Here are some of my notes/scenarios regarding multicast...

You can easily troubleshoot multicast RPF failures using 'debug ip mpacket' in conjunction with 'no ip mroute-cache' on all multicast interfaces in the path. In short, a multicast RPF failure happens when multicast traffic arrives on an interface other than the one the router would use to reach the source via unicast. Fixing it is as easy as adding a static mroute for the source and pointing it to the correct interface/next hop.

Here on R5, the incoming interface for the source on R6 (155.1.146.6) is Null.

(155.1.146.6, 224.10.10.10), 00:00:54/00:02:05, flags: 
  Incoming interface: Null, RPF nbr 155.1.45.4
  Outgoing interface list:
    FastEthernet0/0, Forward/Dense, 00:00:54/00:00:00
    Serial0/0, Forward/Dense, 00:00:54/00:00:00

Obviously this is not correct. The unicast route points to the PtP link between R4 and R5 (Serial0/1), which is not PIM enabled.

R5(config-if)#do sh ip route 155.1.146.6
Routing entry for 155.1.146.0/24
  Known via "eigrp 100", distance 90, metric 2172416, type internal
  Redistributing via eigrp 100, ospf 1
  Advertised by ospf 1 subnets
  Last update from 155.1.45.4 on Serial0/1, 00:08:14 ago
  Routing Descriptor Blocks:
  * 155.1.45.4, from 155.1.45.4, 00:08:14 ago, via Serial0/1
      Route metric is 2172416, traffic share count is 1
      Total delay is 20100 microseconds, minimum bandwidth is 1544 Kbit
      Reliability 255/255, minimum MTU 1500 bytes
      Loading 1/255, Hops 1

Let's fix that. Since the PtP link is not multicast enabled, we will add a static mroute that forces the RPF check for 155.1.146.6 to traverse the FR cloud, which is multicast enabled.

R5(config)#ip mroute 155.1.146.6 255.255.255.255 155.1.0.4

And test/verify.

R6#ping 224.10.10.10


Type escape sequence to abort.
Sending 1, 100-byte ICMP Echos to 224.10.10.10, timeout is 2 seconds:


Reply to request 0 from 155.1.108.10, 76 ms


R5(config)#do sh ip mroute
......


(155.1.146.6, 224.10.10.10), 00:00:18/00:02:46, flags: T
  Incoming interface: Serial0/0, RPF nbr 155.1.0.4, Mroute
  Outgoing interface list:
    FastEthernet0/0, Forward/Dense, 00:00:18/00:00:00

Well that looks better!! Moving on...
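
To make the before/after concrete, here is a rough Python sketch of the RPF decision as I understand it. It is a simplified model, not how IOS is implemented: the tuples, the function names and the "lowest distance wins" rule are my own assumptions for illustration, and the values are copied from the outputs above (a static mroute defaults to distance 0, EIGRP internal is 90).

```python
from ipaddress import ip_address, ip_network

# (prefix, admin distance, RPF interface, RPF neighbor) taken from the outputs above
UNICAST = [("155.1.146.0/24", 90, "Serial0/1", "155.1.45.4")]
MROUTE = [("155.1.146.6/32", 0, "Serial0/0", "155.1.0.4")]
# Interfaces assumed to run PIM on R5 (Serial0/1 is the PtP link without PIM)
PIM_INTERFACES = {"FastEthernet0/0", "Serial0/0"}


def rpf_lookup(source, routes):
    """Pick the route used for the RPF check: lowest distance wins (simplified)."""
    src = ip_address(source)
    matches = [r for r in routes if src in ip_network(r[0])]
    return min(matches, key=lambda r: r[1]) if matches else None


def incoming_interface(source, routes):
    """Return the mroute incoming interface, or 'Null' if the RPF interface has no PIM."""
    best = rpf_lookup(source, routes)
    if best is None or best[2] not in PIM_INTERFACES:
        return "Null"
    return best[2]


print(incoming_interface("155.1.146.6", UNICAST))           # Null
print(incoming_interface("155.1.146.6", UNICAST + MROUTE))  # Serial0/0
```

The first call models the broken state (RPF points at the non-PIM serial link), the second models the state after the static mroute is added.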

I got a little tripped up by some sparse-mode RPF failures. I added a static mroute which fixed the issue, but the issue can also be caused by differences in unicast IGP routes. For instance, when setting the RP to R5 Loopback0, the failure occurs because OSPF announces the loopback as a /32 (to R6) while R4 (the router that sends the RP register message) sees the route as a /24 in EIGRP. Simply adding 'ip ospf network point-to-point' on R5 Lo0 fixes the problem. Now both the EIGRP and OSPF routers see 150.1.5.0/24.

The PIM Assert procedure dictates who will forward the multicast traffic onto a shared segment. The Assert procedure takes into account the AD and the metric of the route back to the source, and if all else fails, the highest IP address. So if two Ethernet-connected routers run different IGPs, the one with the lowest IGP AD will be the assert winner. I misinterpreted the task and adjusted the DR, which is responsible for multicast source registration and not for multicast flooding. Ugh.... You should be able to 'show ip mroute x.x.x.x' and see the A flag. You can also 'debug ip pim' to see the Assert messages. Unfortunately, none of these were working for me. First, I thought it was because we switched to SPT trees. Easy enough, set the SPT threshold to infinity. That still didn't work. Clear ip mroute, nope. Reboot routers, nope. I could not get anything to work. R4 didn't prune the traffic, none of my routers on the shared segment showed the A flag, and debug ip pim never showed Assert messages. I need to do some more research on this one...perhaps it's dynamips, or maybe my IOS version....moving on....
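
For my own reference, the election order described above (lowest AD, then lowest metric, then highest IP address) can be sketched in a few lines of Python. The router names, addresses and metrics below are made up for illustration and are not from the workbook.

```python
from ipaddress import IPv4Address


def assert_winner(candidates):
    """Lowest AD wins, then lowest metric, then highest IP address."""
    return min(
        candidates,
        key=lambda c: (c["ad"], c["metric"], -int(IPv4Address(c["ip"]))),
    )


# Hypothetical routers on one shared Ethernet segment
segment = [
    {"name": "A", "ip": "10.0.0.1", "ad": 90, "metric": 2172416},   # learned source via EIGRP
    {"name": "B", "ip": "10.0.0.2", "ad": 110, "metric": 65},       # learned source via OSPF
]
print(assert_winner(segment)["name"])  # A (AD 90 beats AD 110, metric is never compared)

# Same AD and metric: the highest IP address breaks the tie
tie = [
    {"name": "A", "ip": "10.0.0.1", "ad": 110, "metric": 65},
    {"name": "B", "ip": "10.0.0.2", "ad": 110, "metric": 65},
]
print(assert_winner(tie)["name"])  # B
```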

To limit (*,G) joins towards an RP address, specify the following:

ip pim accept-rp 150.1.5.5 5
access-list 5 permit 224.10.10.10
access-list 5 permit 224.110.110.110

This limits the groups for which (*,G) joins towards RP 150.1.5.5 are accepted to 224.10.10.10 and 224.110.110.110. It does not affect SPT joins. This is different from 'ip pim accept-register', which limits the sources that can register with the RP and is therefore configured on the RP.

When doing Auto-RP, you can load-balance and provide redundancy at the same time, because the most specific (longest match) group range wins. For instance, we can select SW2 as the RP candidate for groups 224.0.0.0/5 and SW4 as the RP candidate for groups 232.0.0.0/5. Here are the pertinent access-lists on SW2 and SW4.

SW2(config-std-nacl)#do sh ip access-list RPs
Standard IP access list RPs
    30 deny   224.110.110.110
    10 permit 224.0.0.0, wildcard bits 7.255.255.255
    20 permit 224.0.0.0, wildcard bits 15.255.255.255
SW2(config-std-nacl)#


SW4(config-std-nacl)#do sh ip access-list RPs
Standard IP access list RPs
    30 deny   224.110.110.110
    10 permit 232.0.0.0, wildcard bits 7.255.255.255
    20 permit 224.0.0.0, wildcard bits 15.255.255.255

By way of the 224.0.0.0/4 entry (sequence 20), both candidates can be the RP for all groups (except 224.110.110.110) in the event the other fails.
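
Here is a small Python sketch of the longest-match idea, under a few assumptions of mine: the RP addresses for SW2 and SW4 are guesses for illustration, the deny entry for 224.110.110.110 is left out, and ties on an identical range are broken by the highest RP address.

```python
from ipaddress import ip_address, ip_network

# Group ranges each candidate announces (permit entries from the ACLs above)
ANNOUNCEMENTS = {
    "SW2": ["224.0.0.0/5", "224.0.0.0/4"],
    "SW4": ["232.0.0.0/5", "224.0.0.0/4"],
}
# Assumed candidate RP addresses, for illustration only
RP_ADDRESS = {"SW2": "150.1.8.8", "SW4": "150.1.10.10"}


def rp_for(group, alive):
    """Longest matching range wins; an identical range goes to the highest RP address."""
    g = ip_address(group)
    candidates = [
        (ip_network(prefix).prefixlen, int(ip_address(RP_ADDRESS[rp])), rp)
        for rp in alive
        for prefix in ANNOUNCEMENTS[rp]
        if g in ip_network(prefix)
    ]
    return max(candidates)[2] if candidates else None


print(rp_for("226.1.1.1", ["SW2", "SW4"]))  # SW2 (224.0.0.0/5 is the longest match)
print(rp_for("239.1.1.1", ["SW2", "SW4"]))  # SW4 (232.0.0.0/5 is the longest match)
print(rp_for("239.1.1.1", ["SW2"]))         # SW2 (falls back to 224.0.0.0/4)
```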

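In working with BSR, you can load-balance based on the hash value. Figuring this out requires some mathematics, but by default the hash mask length is 0, so the group address is ignored entirely, making this rather easy. The hash mask is ANDed with the group address before hashing, so groups that are identical under the mask always map to the same RP. If you select a hash mask length of 31 (255.255.255.254), only the last bit is ignored, so each pair of adjacent groups (239.1.1.0/239.1.1.1, 239.1.1.2/239.1.1.3, and so on) is hashed on its own and the pairs are spread across the candidate RPs. Now, they could make this very tricky by asking which RP a specific group address hashes to. 'show ip pim rp-hash x.x.x.x' will show you the hash values for a particular group as such...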

R4#sh ip pim rp-hash 239.1.1.2
  RP 150.1.8.8 (?), v2
    Info source: 150.1.5.5 (?), via bootstrap, priority 0, holdtime 150
         Uptime: 00:04:14, expires: 00:02:25
  PIMv2 Hash Value (mask 255.255.255.254)
    RP 150.1.10.10, via bootstrap, priority 0, hash value 1093093598
    RP 150.1.8.8, via bootstrap, priority 0, hash value 1364246456

239.1.1.2 is sent to 150.1.8.8, which has the highest hash value for that group....

R6#sh ip pim rp-hash 239.1.1.1
  RP 150.1.10.10 (?), v2
    Info source: 150.1.5.5 (?), via bootstrap, priority 0, holdtime 150
         Uptime: 00:03:28, expires: 00:02:09
  PIMv2 Hash Value (mask 255.255.255.254)
    RP 150.1.10.10, via bootstrap, priority 0, hash value 989207280
    RP 150.1.8.8, via bootstrap, priority 0, hash value 718054422

239.1.1.1 is sent to 150.1.10.10, which has the highest hash value for that group.
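
The hash values in these outputs can be reproduced with the hash function from the PIM-SM specification (RFC 4601, section 4.7.2). The Python below is my own sketch of it; the function names are mine, and it assumes both candidate RPs have equal priority, as they do above. Note that a 31-bit mask zeroes the last bit of the group, so 239.1.1.0 and 239.1.1.1 always land on the same RP.

```python
from ipaddress import IPv4Address


def bsr_hash(group, mask_len, rp):
    """Value(G, M, C) = (1103515245 * ((1103515245 * (G & M) + 12345) XOR C) + 12345) mod 2^31"""
    g = int(IPv4Address(group))
    m = (0xFFFFFFFF << (32 - mask_len)) & 0xFFFFFFFF
    c = int(IPv4Address(rp))
    inner = (1103515245 * (g & m) + 12345) ^ c
    return (1103515245 * inner + 12345) % 2**31


def select_rp(group, mask_len, rps):
    """Highest hash wins; a tie goes to the highest RP address (equal priority assumed)."""
    return max(rps, key=lambda rp: (bsr_hash(group, mask_len, rp), int(IPv4Address(rp))))


rps = ["150.1.8.8", "150.1.10.10"]
print(bsr_hash("239.1.1.1", 31, "150.1.10.10"))  # 989207280, as in the R6 output
print(bsr_hash("239.1.1.1", 31, "150.1.8.8"))    # 718054422, as in the R6 output
print(select_rp("239.1.1.1", 31, rps))           # 150.1.10.10
print(select_rp("239.1.1.0", 31, rps))           # 150.1.10.10 (same pair as 239.1.1.1)
```

With a mask length of 0 the group drops out of the formula completely, which is why every group maps to the same RP by default.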

The next section required that R6 not learn any RP information. I thought of a multicast boundary. Good news! With BSR - it's even easier.

'ip pim bsr-border' (an interface-level command) on both R1 and R4. Now R6 will not hear any BSR messages, so it learns no RP information!

Setting up multicast stub routing requires the 'ip igmp helper-address' command. Additionally, set a PIM neighbor filter on the pertinent interfaces so that downstream devices never become PIM neighbors, and enable dense mode on all downstream interfaces.

Bi-dir PIM can be enabled pretty easily. It's a simple concept: bi-dir multicast suits groups where many devices are both senders and receivers. You will need a bidir-capable RP (through static, BSR or Auto-RP) and bidir PIM enabled on all your multicast routers. Keep in mind that there are no register messages in bidir PIM because devices can send or receive at any time (think videoconferencing).

DVMRP! I had no idea this was even on the blueprint, but it's covered in Volume 1. DVMRP functions much like RIP and uses a hop count metric. This information is only used for RPF checks and not for actual routing, which remains with the PIM/IGP routing protocols.

First, enable DVMRP interoperability within the router, and then enable the interfaces for DVMRP.


R4(config)#ip dvmrp interoperability 

By default, the router will only advertise directly connected subnets. To advertise more information, use the metric command with an access-list and specify the IGP protocol. A metric of 0 will filter out those updates.

R4(config)#access-list 40 permit 155.1.0.0 0.0.255.255
R4(config)#interface f0/1
R4(config-if)#ip dvmrp metric 1 list 40 eigrp 100

You can create a tunnel to the DVMRP cloud. You cannot tunnel between two IOS routers.

R4(config-if)#int tun0
R4(config-if)#ip unn lo0
R4(config-if)#ip pim dense-m
R4(config-if)#tunn so lo0
R4(config-if)#tunn dest 204.12.1.100
R4(config-if)#tunn mod dvmr

Now you won't be able to test this against a real DVMRP router in the actual lab or your own lab, but you can use 'debug ip dvmrp detail' to verify that DVMRP updates are being generated.

Ahh....multiprotocol multicast using BGP. This simply involves enabling the ipv4 multicast address-family under BGP. The multicast AF can have policies separate from the unicast AF, like AS prepend, metric, etc.

I found a very important troubleshooting command. 'show ip rpf x.x.x.x' shows the RPF interface, neighbor and route the router uses for a particular source address. 'debug ip pim' can help identify possible RPF failures. As you also know, mtrace can usually pinpoint exactly where an RPF failure occurs. Also, 'show ip mroute 224.x.x.x count' can show if packets are being received/dropped. I highly suggest reading IP Multicast Troubleshooting. Great read.


To wrap things up is MVR (Multicast VLAN Registration). MVR is a special type of multicast delivery suited for metro Ethernet networks. MVR uses a single dedicated VLAN across the whole ring to deliver the multicast feed to all receivers. The actual receivers are in different VLANs, but the switches intercept the IGMP joins and pull the multicast feed from the MVR VLAN to the receiver VLAN. This is pretty easy to set up. Here are the pertinent configs.


SW1(config)#mvr
SW1(config)#mvr vlan 146
SW1(config)#mvr group 239.1.1.100
SW1(config)#mvr mode dynamic
SW1(config)#int fa0/1
SW1(config-if)#mvr type source
SW1(config-if)#int fa0/5
SW1(config-if)#mvr type receiver


An important note: you can have no more than 256 groups configured for MVR and they should never alias, meaning that two multicast IP addresses map to the same multicast MAC address. So you can't enable MVR for both 228.1.1.1 and 230.1.1.1. Based on the config, you can see that we enable MVR for VLAN 146 using dynamic mode. Traffic received on Fa0/1 will be delivered to R5 on Fa0/5.
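
To see why 228.1.1.1 and 230.1.1.1 alias, recall that only the low 23 bits of the group address are copied into the 01:00:5e Ethernet multicast MAC prefix. A quick Python check (the function name is mine):

```python
from ipaddress import IPv4Address


def multicast_mac(group):
    """Map an IPv4 multicast group to its Ethernet MAC: 01:00:5e plus the low 23 bits."""
    low23 = int(IPv4Address(group)) & 0x7FFFFF
    mac = 0x01005E000000 | low23
    return ":".join(f"{(mac >> shift) & 0xFF:02x}" for shift in range(40, -8, -8))


print(multicast_mac("228.1.1.1"))    # 01:00:5e:01:01:01
print(multicast_mac("230.1.1.1"))    # 01:00:5e:01:01:01 (aliases with 228.1.1.1)
print(multicast_mac("239.1.1.100"))  # 01:00:5e:01:01:64
```

Since the top 9 bits of the address never reach the MAC, 32 different groups share every multicast MAC address.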


Verify with 'show mvr', 'show mvr interface', 'show mvr members' and 'show ip igmp groups'. Disable mroute cache and debug ip mpacket. You should see traffic arriving from the source interface.


Finally, IGMP profiles. These work much like a route-map or a VLAN map and should be self-explanatory. IGMP profiles apply to transit messages, that is, IGMP messages not destined for the switch itself.


ip igmp profile 1
permit
range 232.0.0.0 232.255.255.255
range 239.0.0.0 239.255.255.255
!
interface fa0/4
ip igmp filter 1


Well, it took me far too long to go through the multicast volume, but it's been difficult to find study time with all the traveling and work that I do. In addition, multicast is one of my weaker topics and I plan to revisit this volume if I have enough time. There are only 10 weeks remaining until my lab attempt...time to get cracking...

Tuesday, April 6, 2010

INE Workbook Vol 1 BGP part Deux

'bgp always-compare-med' should be consistent across the AS. This whole always-compare-MED thing still gets me. I created a loopback in both AS 100 and AS 300 with an identical IP address. In the INE workbook, we need to influence AS 200. Within AS 200, we obviously set 'bgp always-compare-med' on every router to make it consistent across the AS. Then from the neighboring ASes we match the route (the loopback address) and set the metric appropriately. We want to prefer AS 300 over AS 100, so obviously you set the metric/MED higher on AS 100 and clear your BGP outbound policies. Here are the pertinent configs from AS 100 and AS 300. This needs to be present on all routers in AS 100 and AS 300 that peer with AS 200.

ip prefix-list Lo1 seq 5 permit 1.2.3.4/32

route-map To_R3 permit 10
 match ip address prefix-list Lo1
 set metric 1000
route-map To_R3 permit 20

router bgp 100
 neighbor 155.1.13.3 remote-as 200
 neighbor 155.1.13.3 route-map To_R3 out

And now AS300

ip prefix-list Lo1 seq 5 permit 1.2.3.4/32




route-map To_R3 permit 10
 match ip address prefix-list Lo1
 set metric 100
route-map To_R3 permit 20

router bgp 300
 neighbor 155.1.37.3 remote-as 200
 neighbor 155.1.37.3 route-map To_R3 out

Poof! Everything worked! Yay! I just need to wrap my head around this. It's not difficult, it's just necessary to be consistent and configure all pertinent routers. Phew. Moving on....

DMZ Link bandwidth is another of those topics that is not difficult to understand or implement. It's just a matter of putting all the pieces together. First, you need to enable dmz-link bw under both the BGP process and for the ebgp neighbors.

R6(config-router)#router bgp 100
R6(config-router)#nei 54.1.1.254 remote-as 54
R6(config-router)#nei 54.1.1.254 dmzlink-bw
R6(config-router)#bgp dmzlink-bw

That was easy enough - but you need to understand how this works. BGP inserts the bandwidth into an extended community to send to its neighbors. So that means we need to enable send-community extended to our iBGP neighbors within the relevant AS.

R6(config-router)#router bgp 100
R6(config-router)#nei 155.1.146.1 send-comm ext

Ok, now we should be collecting bandwidth and sending the data to our peers. BGP does not by default load balance across unequal paths. So now we need to enable that within our AS.

R1(config-router)#router bgp 100
R1(config-router)#maximum-paths ibgp 2

Now we will load-balance across unequal paths within AS 100 based on bandwidth to links in AS54. So now if we view our routes in AS100, we will see multiple paths.


R1(config-router)#do sh ip route | i 204.12.1.254|54.1.1.254
B    119.0.0.0/8 [200/0] via 204.12.1.254, 00:06:06
                 [200/0] via 54.1.1.254, 00:06:06
B    118.0.0.0/8 [200/0] via 204.12.1.254, 00:06:07
                 [200/0] via 54.1.1.254, 00:06:07
B    117.0.0.0/8 [200/0] via 204.12.1.254, 00:06:07
                 [200/0] via 54.1.1.254, 00:06:07
B    116.0.0.0/8 [200/0] via 204.12.1.254, 00:06:07
                 [200/0] via 54.1.1.254, 00:06:07
B    115.0.0.0/8 [200/0] via 204.12.1.254, 00:06:07
                 [200/0] via 54.1.1.254, 00:06:07
B    114.0.0.0/8 [200/0] via 204.12.1.254, 00:06:07
                 [200/0] via 54.1.1.254, 00:06:07
B    113.0.0.0/8 [200/0] via 204.12.1.254, 00:06:07
                 [200/0] via 54.1.1.254, 00:06:07
B    112.0.0.0/8 [200/0] via 204.12.1.254, 00:06:07
                 [200/0] via 54.1.1.254, 00:06:07
B       28.119.17.0 [200/0] via 204.12.1.254, 00:06:07
                    [200/0] via 54.1.1.254, 00:06:07
B       28.119.16.0 [200/0] via 204.12.1.254, 00:06:07
                    [200/0] via 54.1.1.254, 00:06:07


Take notice that this affects the routing table and not the BGP table as BGP will always select a best route over all available paths, but you will see multipath in the route details.



R1(config-router)#do sh ip bgp 28.119.16.0
BGP routing table entry for 28.119.16.0/24, version 47
Paths: (2 available, best #2, table Default-IP-Routing-Table)
Multipath: eBGP iBGP
  Advertised to update-groups:
        1    2
  54, (Received from a RR-client)
    54.1.1.254 (metric 2560002816) from 155.1.146.6 (150.1.6.6)
      Origin IGP, metric 0, localpref 100, valid, internal, multipath
      DMZ-Link Bw 250 kbytes
  54, (Received from a RR-client)
    204.12.1.254 (metric 2560002816) from 155.1.146.4 (150.1.4.4)
      Origin IGP, metric 0, localpref 100, valid, internal, multipath, best
      DMZ-Link Bw 12500 kbytes
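
As a sanity check on what those two DMZ-Link Bw values mean, here is the arithmetic in Python, assuming traffic is shared in proportion to the advertised link bandwidth (that is my reading of the feature; the function name is mine).

```python
def traffic_shares(bandwidths):
    """Split traffic across next hops in proportion to their advertised bandwidth."""
    total = sum(bandwidths.values())
    return {next_hop: bw / total for next_hop, bw in bandwidths.items()}


# DMZ-Link Bw values (kbytes) from the 'show ip bgp 28.119.16.0' output above
shares = traffic_shares({"54.1.1.254": 250, "204.12.1.254": 12500})
for next_hop, share in shares.items():
    print(f"{next_hop}: {share:.1%}")
# 54.1.1.254: 2.0%
# 204.12.1.254: 98.0%
```

So the path via 204.12.1.254 should carry roughly 50 times the traffic of the path via 54.1.1.254.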


If you see something in a scenario saying you should only accept routes from directly connected ASes, but you can't use AS-Path filtering, it is a simple command. This is where having exposure to every detail of a protocol will help you.

router bgp 100
bgp maxas-limit 1



You can selectively delete communities received from a neighbor, while still adding your own. First you create an expanded community list, using regex to match the communities you want to delete. Here, we are matching any community that starts with 200:


SW1(config)#ip community-list expanded TST permit 200:[0-9]+_

Now create a route-map, add any communities you would like to add in this AS, and then delete the selected communities and apply the route-map.


SW1(config)#route-map SetComm permit 10
SW1(config-route-map)#set comm-list TST delete
SW1(config-route-map)#set community 300:200 additive
SW1(config-route-map)#router bgp 300
SW1(config-router)#nei 155.1.37.3 send-comm both
SW1(config-router)#nei 155.1.37.3 route-map SetComm in

Here is the end result.


SW1(config)#do sh ip bgp 222.22.2.0
BGP routing table entry for 222.22.2.0/24, version 80
Paths: (1 available, best #1, table Default-IP-Routing-Table)
Flag: 0x880
  Advertised to update-groups:
     1          2         
  200 254
    155.1.37.3 from 155.1.37.3 (150.1.3.3)
      Origin incomplete, localpref 100, valid, external, best
      Community: 254:100 300:200
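
The effect of the route-map can be mimicked in Python to check the result. This is only an approximation of mine: Cisco's "_" matches a delimiter or the end of the string, which I model here by testing each community on its own with an end-of-string anchor.

```python
import re

# Approximation of 'ip community-list expanded TST permit 200:[0-9]+_'
TST = re.compile(r"200:[0-9]+$")


def apply_setcomm(communities):
    """Delete communities matching TST, then add 300:200 (additive)."""
    kept = [c for c in communities if not TST.search(c)]
    if "300:200" not in kept:
        kept.append("300:200")
    return kept


# Communities on 222.22.2.0/24 as seen on R2 in AS 200
received = ["200:123", "200:254", "254:100"]
print(" ".join(apply_setcomm(received)))  # 254:100 300:200
```

That matches the Community line SW1 shows for the route.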




For comparison, here is the same route from R2 in AS200 (which sends routes to SW1) and from R6 in AS100 (which SW1 sends routes to).



R2(config)#do sh ip bgp 222.22.2.0
BGP routing table entry for 222.22.2.0/24, version 99
Paths: (1 available, best #1, table Default-IP-Routing-Table)
  Advertised to update-groups:
        2
  254
    192.10.1.254 from 192.10.1.254 (222.22.2.1)
      Origin incomplete, metric 0, localpref 100, valid, external, best
      Community: 200:123 200:254 254:100



R6(config)#do sh ip bgp 222.22.2.0
BGP routing table entry for 222.22.2.0/24, version 48
Paths: (2 available, best #1, table Default-IP-Routing-Table, Advertisements suppressed by an aggregate.)
Flag: 0x880
  Not advertised to any peer
  (65014) 200 254
    155.1.13.3 (metric 27260160) from 155.1.146.1 (150.1.1.1)
      Origin incomplete, metric 0, localpref 100, valid, confed-external, best
      Community: 200:123 200:254 254:100
  300 200 254
    155.1.67.7 from 155.1.67.7 (150.1.77.77)
      Origin incomplete, localpref 100, valid, external
      Community: 254:100 300:200





Another fun scenario was using extended access-lists to match a range of subnets, something that would be much simpler with prefix-lists, but alas, can also be accomplished with extended access-lists. In short, the source portion of the access-list matches the network address, and the destination portion matches the subnet mask.

R4(config)#access-list 104 deny ip 0.0.1.0 255.255.254.0 255.255.252.0 0.0.3.255
R4(config)#access-list 104 permit ip any any

This would deny any route with an odd third octet, with a /22 mask or larger up to a /24. I already know how to match networks based on access-lists, but matching the masks with an access-list is something new. Looking at it, it makes total sense, but something I need to see a few more times to really get a handle on.
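
Since I need to see this a few more times, here is the matching logic spelled out in Python. It is a sketch of how I understand wildcard matching (a 1 bit in the wildcard means "don't care"); the function names and the test prefixes are my own.

```python
from ipaddress import IPv4Address


def wildcard_match(value, pattern, wildcard):
    """True if value equals pattern on every bit the wildcard does not mask out."""
    v, p, w = (int(IPv4Address(x)) for x in (value, pattern, wildcard))
    care = ~w & 0xFFFFFFFF
    return (v & care) == (p & care)


def acl104_denies(network, mask):
    """Source half matches the network address, destination half matches the mask."""
    return (wildcard_match(network, "0.0.1.0", "255.255.254.0")
            and wildcard_match(mask, "255.255.252.0", "0.0.3.255"))


print(acl104_denies("155.1.5.0", "255.255.255.0"))  # True  (odd third octet, /24)
print(acl104_denies("155.1.4.0", "255.255.255.0"))  # False (even third octet)
print(acl104_denies("155.1.4.0", "255.255.252.0"))  # False (a /22 always has an even third octet)
print(acl104_denies("155.1.0.0", "255.255.0.0"))    # False (mask shorter than /22)
```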

Moving on, I stumbled across BGP dampening. Something I always thought was pretty easy, but INE presents it differently. "Once a prefix flaps two times in a row, the advertisement should resume in 5 minutes". Most scenarios will specify the actual dampening parameters. Here is the actual formula:

P (t) = P (0) /2^(t/Half_life)

We know that the time we want is 5 minutes.

P(5) = P(0)/2^(5/Half_life)

Now substitute P(5) = 750 (the reuse limit) and P(0) = 2000 (suppress limit). The equation becomes:

2000/750 = 2^(5/Half_life)

From this equation, take the logarithm of both sides.

Half_life = 5*Ln(2)/Ln(2000/750) = 3.5 (approx.)

We may now round up to 4. Max suppress time is by default 4 x half_life, giving us the following:

router bgp 200
bgp dampening 4 750 2000 16
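
To convince myself of the numbers, here is the same calculation in Python. The function names are mine; the formula is the decay formula above, with a starting penalty of 2000 and a reuse limit of 750.

```python
import math


def half_life_for(reuse_minutes, start_penalty, reuse_limit):
    """Half-life (minutes) so the penalty decays from start_penalty to reuse_limit in reuse_minutes."""
    return reuse_minutes * math.log(2) / math.log(start_penalty / reuse_limit)


def reuse_time(start_penalty, reuse_limit, half_life):
    """Minutes until the penalty decays from start_penalty down to reuse_limit."""
    return half_life * math.log2(start_penalty / reuse_limit)


exact = half_life_for(5, 2000, 750)
print(round(exact, 2))                      # 3.53
print(math.ceil(exact))                     # 4, the value used in the config
print(round(reuse_time(2000, 750, 4), 2))   # 5.66
```

One thing this shows: because the half-life has to be a whole number of minutes, rounding up to 4 means the route is actually reused after about 5.7 minutes rather than exactly 5.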


Wow. Am I the only one lost? It's been a number of years since I've taken an advanced math class, so I'm guessing that is why I am lost. I'll review this again later, but I would hope Cisco doesn't expect all candidates to be up on advanced math.


The default BGP scanner time is 60 seconds. Keepalive defaults to 60 seconds, with the holdtime being 180. The BGP scanner is responsible for checking the validity and reachability of BGP NEXT_HOP attributes, performing conditional advertisement and route injection, importing new routes into the BGP table from the RIB, and performing route dampening. If you receive a scenario to configure BGP to do route injection every 20 seconds, you will need to alter the BGP scan time. Also, setting the advertisement-interval to 0 will remove batch updates and advertise immediately.


You can adjust the bgp next-hop trigger timers. Remember the BGP scan time is responsible for validity and reachability of the NEXT_HOP attributes. If we use the next-hop trigger timers, we can tune this to be more in line with IGP timers.

That finally wraps up BGP. This took me a while, but I wanted to take my time going through BGP, knowing that it is one of my weaknesses. More to come later from Vol 1.....