Monday 13 August 2012

Cisco Smart Logging Telemetry (SLT) on a 3560G - Netflow Trial

A while ago, I researched the possibility of using/enabling netflow on a couple of Cisco 3560G switches we'd purchased.  After much head scratching, it was deemed you'd need IOS revision 12.2(58)SE to get any form of netflow - and it's not even true netflow. Hmmmm....

So... out of hours, I upgraded one of the switches to this new firmware to give it a go.
After a nervous wait for it to restart (logging in remotely from home), I checked to see if the commands were available - which they were.
So off we go:
Create the exporter
flow exporter test-collector
 description "collection of data for the boss"
 destination 192.168.246.100

This creates an exporter profile in which I can describe what it's for and where the data should go.

Then configure smartlog:
conf t
logging smartlog
logging smartlog exporter test-collector
logging smartlog packet capture size 1024
We then need to give it something to report about.  As a test I created an ACL which permits everything:
access-list 97 permit any smartlog
and then assigned it to an interface
interface GigabitEthernet0/3
 description Monitoring port for the boss

 ip access-group 97 in

And that's it - pretty easy to set up, but to be honest its limitations are immediately visible when using Scrutinizer (or other netflow/sflow related applications).  I've just shown you a test example of the ACL here - I amended it later to monitor a specific DENY of a particular protocol on a specific port.

This is netflow but with restrictions.  The data is event based, so you can only view data once an event has occurred and hence been logged.  That in itself limits what you can see.
Don't get me wrong, it's far easier than trawling through syslog data and you can drill down into the raw data to at least see some of the packet, but other than assessing security concerns, I'm struggling to see how I'd use it.
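You can also check what's being captured from the CLI rather than relying on the collector.  This is from memory on the 12.2(58)SE release, so verify with ? on your own kit:

```
show logging smartlog
```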

In terms of satisfying the original request to view netflow data, I will put the question back to the originator with a politically cryptic hint of sarcasm.... "What netflow data????"

Thursday 19 July 2012

Netflow NoGo on a Cisco Catalyst 3560 switch

In our haste to purchase switches for an already over-run project, it would appear we overlooked the product features of the 3560G. 
I have been asked to enable netflow on one of the two switches we use in a production environment, but after much head scratching - the 3560 doesn't support it.

Looking further into it, the whole 3000 series doesn't.  Unless you either buy the 10G uplink modules (but then I'm sure it'll only allow you to monitor the uplinks) or use a trimmed down version of netflow exporting, appearing in the later 12.2(58)SE firmware revision.  Of course, we are a few releases behind that - but I think it might be worth giving it a go anyway to try and satisfy the request.

So, it's a netflow no go.  For now.

Wednesday 13 June 2012

IPSEC over GRE VPN? or maybe it's the other way around...

Moan
It's been a while since I've sunk my teeth into something new.  I've had the time consuming tasks of updating CVs and interview skills, for the impending shake up of the organisation I work for.  CCNP study has taken a slight sideways step (better than backwards) during recent weeks - but I've got a holiday booked shortly and there is no better way to learn than in the sun.

Problem
Anyway, we have the following situation:


Without concentrating on the lines and arrows, there are two sites depicted in the diagram.  The top one (the top blue oblong that covers the LAN and two routers) is what we'd call a hub site.  It is one of 5 sites that all connect together to form the core of the network.  As you can see with the 3 black lines coming from the Cisco 3745.
The other site (the bottom blue oblong) is what we call a remote site.  They typically connect back to their closest hub site, (based on legacy geographical cost) and we support about 60 of these remote sites - so the diagram is obviously cut down.
Both sites have an internet connection.  The router that manages the internet connection is controlled by a third party - so we have no access to it.  It has no firewall though, so we run CBAC on our routers.

The green lines at the remote site show traffic flow.  The 1841 makes the decision (based on static routes) which path to take based on the destination address.  If it is one of a pre-defined IP range for a website, then off it goes out of the local internet connection.  If it is an internal address (server, other remote site etc) then the packets are routed through the default route and back to the hub site via a 2mb serial connection - WHICH IS OVERSUBSCRIBED.

Question?
And thus a question has been asked:

"If the 2mb connection goes down (often), can there be a backup route using the internet connections?"

Also...

"Could the two routes be used in tandem to increase throughput?"

The diagram (if possible) might look like this:


Answer...
The answer is yes. To both questions.  Although I can't see this being implemented.

We already use GRE tunnels (see my previous posts) but that is internal.  With this setup, we'd need some form of encryption when tunnelling through the internet. 

I'll explain the tunnel setup first and just gloss over the other parts.  As I mentioned, this probably won't get implemented - it's purely answering the question to detail a business case to make it happen, to then not make it happen.  Does that make sense?  No - that's politics for you.

On with the technical.

IPSEC over GRE/GRE over IPSEC...
It's one or the other, but I don't know/don't need to know which - google is your friend, if you're bothered.  Technically the tunnel would be accomplished using Generic Routing Encapsulation, yet we'd run IPSEC to encrypt the GRE packets.  But then, you need IPSEC to create the encryption to run GRE in the first place.  But you need the tunnel to then use the transport protection of IPSEC.  Chicken & egg, I suppose.

I've configured the remote site router first:

!
crypto isakmp policy 10
  authentication pre-share
!
crypto isakmp key CISCO address 10.66.6.5
!


We define an ISAKMP policy (and give it a priority number) and then instruct the policy to use pre-shared keys as the authentication method.  Two peers must have a common policy or they won't connect.
We then specify what the key will be - in this example CISCO is used (passwords and IPs are obviously made up for the purposes of this post!).  We also define the 'other end' IP address with whom the key exchange will take place.

!
crypto ipsec transform-set Transformers esp-3des esp-sha-hmac
  mode transport
!
crypto ipsec profile MyProfile
  set transform-set Transformers
!


Transform sets are then used (in this case we called ours Transformers) to detail which encryption and authentication methods are used for the ESP protocol (Encapsulating Security Payload), which is what establishes the IPsec tunnel encryption and authentication between the peers.  They must be the same at both ends or the IPSEC tunnel will not form.
The IPsec profile defines the parameters to be used between the two routers, which we will reference in the tunnel protection command later.  It references the previously created transform set - you might find this confusing, but remember you can use different transform sets with different profiles, and both of those can be referenced with different ISAKMP policies!  As our example is a simple point to point, some commands can seem overkill, but they are necessary.

!
interface Tunnel1
  ip address 12.0.0.2 255.255.255.252
  tunnel source 10.99.9.1
  tunnel destination 10.66.6.5
  tunnel mode ipsec ipv4
  tunnel protection ipsec profile MyProfile
!
And here we have one end of the tunnel.  This one is referenced locally as 'tunnel1'.
The ip address is a completely private range and we've assigned a subnet mask so that only the two peers are useable.
The tunnel source is the interface IP for the interface you want to create the tunnel on.  Similarly the destination is the interface IP of where the end of the tunnel should be.
Rather than using GRE, we specify the tunnel mode to be ipsec ipv4 and then introduce the tunnel protection command to reference the ipsec profile - which in turn references the transform set with encryption and authentication information.

That's pretty much it.  That creates one end of the tunnel on the remote site router, and the process of creating the other end is just a reversal of the IP addresses used above - oh, and the IP address of the tunnel being the other useable one in that subnet!
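For completeness, the hub site end would look something like this - the mirror image, assuming the same ISAKMP key/policy, transform set and profile have also been configured on that router:

```
!
interface Tunnel1
  ip address 12.0.0.1 255.255.255.252
  tunnel source 10.66.6.5
  tunnel destination 10.99.9.1
  tunnel mode ipsec ipv4
  tunnel protection ipsec profile MyProfile
!
```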

We configured the above in a test environment and guess what?  It didn't work.

Why it didn't work
I mentioned earlier that we use CBAC (or ip inspect as it's referenced) and thus inbound access lists prevent any nasties from the internet making their way into the corporate network.  We had to add a couple of earlier sequence numbers to an extended access list to ensure that both UDP port 500 (ISAKMP) and ESP were permitted from the respective peer.

Like so:

!
ip access-list extended INBOUND
  8 permit udp host 10.66.6.5 eq 500 host 10.99.9.1 eq 500
  9 permit esp host 10.66.6.5 host 10.99.9.1

!

Remember that because the list is INBOUND and applied with the 'in' statement, the source and destination fields of the ACL must reflect the peer at the other end.  I'm only making this point because I'm used to amending ACLs for outbound access and thus I did it wrong.
Default keepalives were at 10, so we thought we'd adjust them to 5, but it's user preference and not the reason why it didn't work.
Be aware:
show int tun1 - should tell you if both the interface and line protocol are up.  Check the logs too, to ensure the tunnels are passing the authentication and encryption stages of the exchange.  Ping the private addresses from each device to ensure connectivity is established.  If you add static routes to test end-to-end IP ranges, then ensure you specify the source interfaces when trace routing or pinging.
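A handful of standard IOS crypto show commands are also worth running while troubleshooting:

```
show interface tunnel1
show crypto isakmp sa
show crypto ipsec sa
```

show crypto isakmp sa should list the peer in a QM_IDLE state once phase 1 is complete, and show crypto ipsec sa shows packets actually being encrypted and decrypted.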
Using it as a backup route - question 1
So the question was asked about using the tunnel as a backup.  Given the 2mb serial is referenced with a default route (ip route 0.0.0.0 0.0.0.0 ...), we can simply add another static route with a higher administrative distance, to form a 'floating' static.
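Something like this (the serial next hop and the distance of 250 are invented examples - anything higher than the default static distance of 1 will do; 12.0.0.1 is the far end of the tunnel from the config above):

```
! existing default route via the 2mb serial (next hop invented)
ip route 0.0.0.0 0.0.0.0 10.50.50.1
! floating static via the tunnel, only installed if the above is removed
ip route 0.0.0.0 0.0.0.0 12.0.0.1 250
```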
Load balancing - question 2
I didn't look into this option (I know, I know) but it's simply because I can't replicate it in a lab environment.  BUT... because we use RIPv2, the tunnel would technically have to have the same hop count to reach the subnets in question as it would using the 2mb circuit.  If so, then by default the Cisco router would load balance with 2 entries in the routing table.  See my earlier posts about process and packet switching to understand how this load balancing would occur.  It would likely be load balanced per destination and not per packet - even so, with the 2mb serial connection saturated, a little breathing space would be created, if only temporarily!

Hope it helps you if you need to create a point to point tunnel - securely.

Monday 16 April 2012

Symantec Antivirus SAV 10.x goes End of life (well EOSL) sort of..

7ish years ago SAV 10.x was at the forefront of corporate AV protection.  I personally thought it was great and we happily upgraded a year after its launch from the 9.x version on our network.

An announcement has been made that it is now going EOSL in July 2012.  It's interesting to note that it will go end of support life rather than end of life - yet the message we have received and seen can be a bit worrying...

Firstly, the use of 'support' life will cause many admins to raise an eyebrow.  Does this mean that we can still get definition files/updates?  If not, we'll have to hastily migrate to the SEP 12.1 product within 3 months - or switch vendors altogether.

Secondly, there is no mention of this in the emails sent to customers and/or on the website.  Not announcing either way whether definitions will be available past July 2012 will potentially scare customers into going with SEP 12 anyway - a clever marketing ploy, I think.  This could backfire though; like I said, it's a choice to upgrade to SEP 12.1 and it's just as much of a change as it would be to switch vendors altogether.

The news from our camp (after contacting Symantec) is that XDB updates will not be available online for SAV 10.x installations.  We rely on manually updating the definition files daily - whilst many employ the LiveUpdate application to do this for them - we're old fashioned and like to see the file download + run.  BUT.... if you continue to use LiveUpdate for SAV 10.x installations after July 2012 then you 'should' still be ok.

Make of that what you will.

I noticed that you can pay to have definitions available beyond July too!  So Symantec would charge you for packaging the update and hosting it for download - when LiveUpdate will continue to work for free?  Money for old rope? - I think so.

Now, many people will question why you're still on a legacy AV product anyway - what with security and all, being a top priority in most organisations.  I do have some sympathy for the folk left behind - I've managed migrations to SEP and it is a task - one that should be project managed if you're doing a large scale deployment and its features and functions are massively different from the 10.x platform to boot.  Smaller organisations might have employed better edge protection and thus have decided to stick with 10.x or there might be financial reasons.  The best case I've heard is of a small business running 10.x because when they paid for contracted resource to install it (in 2008) it has just 'worked' ever since.

I still think that Symantec could have done/can do this better.  Their clever/not-so-clever decision to omit the definitions question will either alienate customers or force their hand (neither makes for good customer relations).  Why not announce that definitions will continue to be available for 6 months after the EOSL period, or tie in with the renewal period of current licence holders?  Sack off the fee-for-updates idea (that just makes you look greedy) and distribute trials of SEP to current 10.x customers so they can use the 6 months to familiarise?

Anyway, goodbye to SAV - thanks for all the phish(ing) emails you've not protected me against - only joking Symantec.  Can I have a job please?

Wednesday 11 April 2012

EIGRP Route filtering (from CCNP study)

I have managed to mock up the 'work' network using GNS3 = great success.
As such, I can apply the CCNP knowledge learned through study on the work based topology to enhance my comfort levels with the various commands.

Today (and the past few weeks) has been focusing on EIGRP, which we do not use at work (from the ICND1 days, you'll know that EIGRP is Cisco's proprietary routing protocol) because we have some HP (peh) equipment at the core.

Route Filtering with EIGRP
Filtering inbound routes by source (using a route-map)

I set myself a small task of filtering EIGRP routing updates on an inbound interface.  I inserted the following commands:

access-list 99 permit 192.168.240.2

and then:

route-map test deny 10
 match ip address 99
route-map test permit 20

and then (under router eigrp <process number>):

distribute-list route-map test in


Explaining the above: I wanted to block routing updates from a device with IP 192.168.240.2, listed in the ACL 99 I created first.  I then created a route-map called 'test' and denied anything that matched the IP address defined in ACL 99.  Route-maps end with an implicit deny, so a permit statement with a later sequence number (20) was added to permit everything else.
I then inserted the distribute-list command referencing the route-map name and specified it as inbound.

Sounds good - but it didn't work! LOL.

The answer was with the match statement in the route-map:

match ip route-source 99

With this in place, routing updates from 192.168.240.2 were not in the routing table - lesson learnt.
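So the full working set of commands ends up like this (the EIGRP process number is just an example):

```
access-list 99 permit 192.168.240.2
!
route-map test deny 10
 match ip route-source 99
route-map test permit 20
!
router eigrp 1
 distribute-list route-map test in
```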


Filtering inbound routes by what is inside the route advertisement (Using an ACL)

Now lets say we wanted to receive routing updates from the router 192.168.240.2 but within its route advertisement it was advertising a route to 192.168.15.0/24 and we would like to remove that.

Distribute lists do not work with extended ACLs, so it's best to create a standard IP ACL.

ip access-list standard test
 deny   192.168.15.0 0.0.0.255
 permit any


Like so.  We have created a standard ACL called 'test' and denied the subnet we want kept out of the routing update.
Under the router process we define the distribute list:

distribute-list test in

To check the ACL is matching packets you can run a show access-list command and the show ip route command should confirm that the route is being filtered.

I will state as a mini-disclaimer that I do not like inbound route filtering unless it is a real requirement for complex networks.  Route redistribution plays a role in inbound filtering, but for most networks the inbound filtering component can be a headache to manage and understand (especially if not documented correctly).  Also, in most enterprises a branch site router over some WAN link would want a route filtered before it traverses the WAN link!  Otherwise the bandwidth is being used, only to have the route removed at the destination.
The early CCNP texts reference route filtering on outbound interfaces only - obviously to keep you sane! But also, it's far easier to define what is being filtered out than in (in my opinion) at this early stage of learning.

Monday 26 March 2012

Break Sequence to access Rommon on a Cisco Router

I was handed a lovely new laptop by work, but it didn't have a pause/break key.  I needed to perform password recovery on a router we'd had for too long, and thus I thought I was stuck for a minute.

But you can initiate a break sequence by doing the following:
  • Obviously connect to the router via the console port
  • Open a hyperterminal/terminal session (I use Putty.exe) and set the connection to use serial and a speed of 1200.
  • Power off the router and back on again, whilst pressing a key every second, for 10-15 seconds.  You will just see a load of crap on the screen, but that's not a problem.
  • Close the terminal session (do not power cycle the router or anything) and open a new one with the correct baud speed of 9600.  You may have to press enter, but it'll put you at the Rommon prompt.

Friday 23 March 2012

Robocopy a standard part of the OS!

Gone are the days of downloading the resource kits (or .exe) to use robocopy.  In Windows Vista, 7 and Server 2008 it's standard in the OS - just type cmd and robocopy /? and see for yourself.
I have just spent 30 minutes searching our internal sharepoint server for any links to the exe when I then googled it and found this out :(
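If you've never used it, a typical mirror job looks something like this (paths invented; /MIR mirrors the source including deletions, and /R and /W cut down the default retry behaviour):

```
robocopy C:\Data \\server\backup\Data /MIR /R:1 /W:1 /LOG:C:\robo.log
```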

Limited learning updates recently - simply because the day job has plenty of monotonous tasks that need to be resolved before studying for CCNA Wireless and CCNP ROUTE can commence.

Tuesday 6 March 2012

CCNA Complete

I'm happy to state that ICND2 was passed last week, so the CCNA achievement I set myself for 2011 is complete.  Albeit overdue to some degree, but still only by a few months.

CCNP and/or CCNA Wireless look to be the next certifications, books due in a few days.

Hopefully I will have learnt something new to blog about at work, whilst waiting for the new books to show up.

Sunday 5 February 2012

A few bits I forgot.. ICND2

The Boson exam environment software had a few trick questions for me today.

One such question was about being able to ping a computer on a different Vlan.  The question asked if the ICMP echo would traverse one particular switch in the diagram.  I answered NO.  The correct answer was YES.
When you ping from one device to the other and they are on different Vlans, the gateway will be contacted, so follow the path, to and from.  In my wisdom I thought the switch would check its mac table, notice the client in a different Vlan and route it, but switches aren't routers, are they?

Another gotcha was to do with OSPF.  Routers can't have mismatched subnets or hello timers.  The network commands can be different (which enables OSPF on an interface) because that is merely ruling out which interfaces should run OSPF via a wildcard mask.

Finally, ISL is Cisco's own proprietary trunking protocol that encapsulates a normal ethernet frame before forwarding over a trunk.  I thought it was an open standard - stupid me.

Friday 3 February 2012

Cold or Warm Reboot/Reloads on Cisco routers

Unbeknown to me, you can warm reload Cisco routers and cut down the boot time.  I read this on Jeremy Gaddis' blog from the States.  The guy knows his stuff, but on with the learning...

Have a go:

conf t
warm-reboot
end

Yes, that's it.  You can configure other variables, such as allowing the router to warm reboot a few times and, when at a certain limit, perform a cold restart etc, but the basic command is above.  I'm sure you know how to use the tab and question mark keys by now...

To then reload the router/switch from the prompt type 'reload warm'.

Obviously there are pros and cons to this command.  The boot time is reduced, but you need a bit of room in RAM in order to buffer the data.
The router simply saves the state in RAM and boots from that, rather than all the copying/decompressing of the IOS image from flash each time.  I suppose if the IOS image held in RAM becomes corrupt (er... not sure how many times/if ever I've heard of this), then a cold restart will load a fresh IOS image back in.  But... let's be realistic, we want quicker booting routers, so this should be a staple command.

Tuesday 31 January 2012

IPv6 Abbreviations - Abbreviating rules

Just as I thought I was mastering IPv6 understanding, the result is FAIL.
I got the removal of 0's the wrong way round.  Take note, unlike I did.

"When abbreviating an IPv6 address
you can remove the leading 0's from within a single group (hextet).
You cannot remove the trailing 0's.
:: can be used only once in the address
to collapse groups of 0's."

2233:0000:2222:0011:0000:0000:0000:0001
can become:
2233:0:2222:11::1

Friday 27 January 2012

IP GRE Tunnel + PBR (Policy Based Routing)


A little project that has taken up a few days at work recently was: how to route traffic on a remote network out of an internet connection on a local network?  The closest idea that could be invented within the department was to turn off the remote network's internet connection (it is 1/10th the capacity of the local one in question) and let the routing tables take over.
I thought there was a better way and suggested PBR as the answer.  This led to another network administrator and myself drawing up how it would work.  Cue whiteboard and a bit of head scratching.  We only wanted certain traffic to NOT use the remote internet connection - even though it was slower than the local one, it was still perfectly serviceable.

I might use local and remote in the wrong context here, but the LOCAL internet connection is the one I'm sat next to and the REMOTE network is at the other side of the city - just for reference!

PBR
The idea of using PBR is to detect traffic on the remote network (matching it by destination and source) and policy route it to the local internet router.  Whilst this sounds a solid enough idea, PBR relies on the next hop being locally connected (which it definitely isn't in our case - the traffic must traverse a core network, approximately 4 hops or so).  Also, the core network uses HP Layer 3 switches and this model is not capable of PBR (or a great deal else... but that is another story), so it's not like we can simply keep PBR'ing all the way to the local network!  PBR would play its part in matching the traffic and assigning a next hop value; we just needed to create some form of connection that appeared local between the two sites to then assign PBR to it.
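As a rough sketch of the PBR side (the ACL number, route-map name, interface and IP addresses are all invented for illustration), the matched traffic gets a next hop of the far end of whatever local-looking connection is created between the sites:

```
access-list 150 permit ip 192.168.50.0 0.0.0.255 any
!
route-map TO-LOCAL-INET permit 10
 match ip address 150
 set ip next-hop 172.16.0.1
!
interface FastEthernet0/0
 description LAN interface where the traffic arrives
 ip policy route-map TO-LOCAL-INET
```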

IP Tunnel
What we needed to do was create an IP tunnel from the remote network to the local one.  To accomplish this, we could have used a VPN, but encryption on the internal network was overkill, so a simple GRE IP tunnel was investigated.  It stands for 'Generic Routing Encapsulation' and does the job we want of logically creating a point-to-point tunnel between two remote devices.  I must say that it is super easy to set up.  We set up the remote site router to run keepalives, so as to not use the connection if the tunnel was down - a cheap form of disaster recovery, you could say!  This forced the router to pass traffic through the normal routing table if the other end of the tunnel was offline.
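A minimal GRE tunnel with keepalives looks something like this (addresses and interfaces invented; GRE is the default encapsulation so no tunnel mode command is needed):

```
interface Tunnel0
 ip address 172.16.0.2 255.255.255.252
 keepalive 5 3
 tunnel source FastEthernet0/1
 tunnel destination 10.20.30.40
```

keepalive 5 3 sends a keepalive every 5 seconds and declares the line protocol down after 3 misses.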

Stuff I forgot about
Whilst setting up the PBR to state a next hop for the 'matched' traffic was easy, and the tunnel was also easy to create, it still didn't work.
On the local router the only configuration commands required were those for the B-end of the tunnel.  Nothing more.  It was this oversight that proved to be the problem.
Of course, to traverse the router to access the internet we use NAT.  You guessed it already: I didn't add the tunnel interface as an inside NAT interface - lesson 1.
Lesson 2 was also NAT related.  We use ACLs to determine which traffic can access the internet - the remote site's IP range was not specified.  And that concludes our lessons for today.
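In config terms, the two lessons boil down to something like this (names and ranges invented) - the tunnel interface marked as NAT inside, and the remote site's range added to whatever ACL the ip nat inside source command references:

```
interface Tunnel0
 ip nat inside
!
ip access-list extended NAT-TRAFFIC
 permit ip 192.168.50.0 0.0.0.255 any
```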

Learnt today?
All of it really.  I did have an idea about PBR already, but didn't know it was only local gateways that could be used as next hops.  I even thought the recursive command would be useful, but it turns out that the router performs a recursive lookup for the remote gateway and routes the packet TOWARDS it, rather than encapsulating it etc - this obviously was still an issue with the HP switches.
I also learnt the GRE tunnel isn't intelligent.  The remote end makes no difference to routing decisions if it is down, thus creating a routing black hole.  I thought about using an IP SLA and tracking to check the remote interface and base the PBR on that, but keepalives did the job on the remote router where the traffic originates - this keepalive actually affects whether PBR is used or not - news to me.
Also, did you know that when using PBR you are actually process switching?  Unless you state (at interface level) - ip route-cache policy.
PBR was also a bit confusing when using a loopback interface to test the tunnel - it didn't work correctly, and this is because router generated traffic is not influenced by PBR unless you turn it on globally using - ip local policy route-map *map-tag*.

The most important thing learnt today was that working with others has its benefits when drafting up solutions like this.  Brain-storming and bouncing ideas off each other is a form of learning, even if the outcome isn't as expected!

Monday 23 January 2012

Monitoring a connection via IP SLA, Tracking & Syslog

It's been nearly a week before I managed to learn something new at work.  It encompasses a few bits though!  Here goes:

We have 2 connections at branch sites.  One connection heads back to the core network and the other is a connection to our organisations version of the Internet.  This Internet connection is managed by a third party and we have to log calls to prompt any action if it is suspected it is at fault - we have no access to the router.

Recently it has been at fault - it's just proving it!

We statically route a long list of addresses out of the Internet connection on Cisco 1841 routers, rather than them being routed internally.  It's more efficient to use the connection back to the hub site for internal data, and the internet connection is funded centrally, so why not use it for internet traffic?

The problem seems to be that every 'now and then' user computers will 'hang' for a period of a few minutes, or substantially longer, when they are accessing services through this connection.  Our desktop engineers have ruled out the PCs - and I was called in.  The symptoms seem familiar to a home user accessing a web page with some inbuilt java or scripting: the connection dies and the PC sits there with an egg-timer until the router/connection is restored, and off they go again.

The router and switches that we can manage look fine.  No error messages of any sort and the interface counters seem relatively solid. 

As we use static routing defined like so...:

ip route 0.0.0.0 0.0.0.0 192.168.242.22
ip route 20.130.152.0 255.255.255.0 10.128.99.32

... The defined route will only check its interface and line status to the device known as 10.128.99.32 - which is the 3rd party router connected via a crossover cable next to it.  If the crossover breaks, or both interfaces go down, then the route will be removed and traffic will follow the default route listed above.  It's clear that this isn't happening: the interface counters and syslog don't show it to have dropped, and internet traffic hasn't been routed to our core firewalls, so the issue must be on the wider 3rd party connection.  But how to prove that...

We have IOS 15.1(3)T - so I can't vouch for older IOS revisions in terms of the commands used here, just for info.

Find an I.P
First I found a 'pingable' IP address of one of the systems accessed over the connection.  It's surprising how many of the services we use wouldn't respond, but luckily the most important system allowed ICMP echoes.  You may question how pinging one IP address will confirm the connection is down - and I'd agree with you that it doesn't.  But... we do access this system from 100+ other sites and it has an uptime of 99.9%, so, in my book, it's as solid an address as I can find.

Configure IP SLA
Configuring the IP SLA manually will allow you to set definable thresholds, frequency and timeout values.  But I did actually stumble across the SolarWinds IP SLA monitor.  It's free, and therefore it's not a polished product that can be tweaked to the nth degree, but it works ok for the task.  Ensure you have write SNMP access on the device and that you know the community string, otherwise you'll get zero results!
After configuring the SLA through the application, I did have to amend the configuration in the CLI, just to tell the SLA that it should use the interface connected to the internet connection to do the pinging!
It ended up looking like this:

ip sla 1
 icmp-echo 20.130.152.151 source-interface FastEthernet0/1
 owner SW_IpSla_FreeTool_test
 frequency 30
ip sla schedule 1 life forever start-time now

I like how the application has given itself an owner tag.  In any case, the little desktop application shows a green light - so the SLA is a success at the minute.  Also running show ip sla statistics, proves that to be the case.  The frequency field is defined by the application (boo) and amending it doesn't allow the SLA to be monitored through the application (frequency is how often the ping occurs, in seconds).

Configure a Track
I then setup a track on this object so that we could assess its reachability state and base some actions on it.  The track has to be the easiest series of commands:

track 1 ip sla 1

show track gives the state, changes and time of last change etc.

Configuring Event Manager
Rather than looking through the router logs after logging in, we'd like events to be referenced in Syslog so we can view them through our network management software.

Event manager is the way to do this.  You can set actions (such as log to syslog with a message) based on the state changes of our Tracked object.  In turn, we can build a log of how often the internet connection appears to drop.  We can compare this log with another site router (performing all the same tricks as this one) to approach the 3rd party with something comparable - doing their job for them ideally.

Here are the commands to configure event manager:

event manager applet internet-down
 event tag pingdown track 1 state down
 trigger
  correlate event pingdown
  attribute tag pingdown occurs 1
 action 1 syslog msg "*********Internet DOWN**********"


event manager applet internet-up
 event tag pingup track 1 state up
 trigger
  correlate event pingup
 action 1 syslog msg "*********Internet UP**********"

We have two applets based on the 'state' of the tracked object.  One logs the internet as down and the other as up.  It's important to log when it comes back up, so we know how long the outage lasted.
You may also notice a couple of seemingly redundant commands - namely the trigger and correlate statements.  Originally, during testing, I created two event tags for each applet.  This allowed two IP addresses to be pinged (I had two IP SLAs and two tracks) and the correlate would take both into account - it's just a slightly more accurate way of saying the connection was down/up, basing it on more than one IP address.  I subsequently removed IP SLA 2 and track 2, but left the trigger and correlate in place to show what is possible with event manager.
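For reference, the two-probe version of the down applet looked roughly like this (the second tag name and track 2 are from memory and illustrative, since I've since removed them):

event manager applet internet-down
 event tag pingdown1 track 1 state down
 event tag pingdown2 track 2 state down
 trigger
  correlate event pingdown1 and event pingdown2
 action 1 syslog msg "*********Internet DOWN**********"

With the correlate statement joining both tags with "and", the syslog message only fires when both probes agree the connection is down.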

And there we have it.  A free desktop application keeps an eye on our IP SLAs, as well as logged data recorded from the router.  If we have to go a step further in the future, it would be to tweak the SLA to suit our parameters and then base routing decisions, via PBR (policy-based routing), on the reachability of some service IP addresses (or other SLA probes).

But for the time being, management just want a log of how often the problem occurs, so in a month or so, I hope we'll have some results - or not... the poor users will be the ones affected if I do have something to show!

Tuesday 17 January 2012

SEPM CPU 90%? - its normal say Symantec

A trusty network administrator has just been on the phone with Symantec regarding a couple of SEPM (Symantec Endpoint Protection Manager) servers running at 90% CPU load for no apparent reason.

We're using version 12.1 by the way.

Symantec say this is normal if replication is used (which it is).  You probably won't see this load with a stand-alone SEPM installation.
Symantec say that although the resource is consumed, it will be released should the server need CPU for other services or processing.  The high CPU load witnessed is the SEPM's way of checking its database for consistency.  Seems like a lot of checking to me, but when the server was asked to perform another duty, it did so without exploding - so maybe Symantec are right, it's their product after all...

That would be the case for the embedded database - which we are using - but Symantec also say that a SQL server (if one was used) would run equally high CPU, as it follows the same principles.

CPU queries seem to be flavour of the month at our place!

Cisco IP Interface switching

Whilst studying some text on NAT, I kept noticing segments of text referring to Cisco switching mechanisms.

Here's a line of the text:
"On a NAT translation table, an asterisk means that the translation is occurring in the fast-switched path.  By default the first packet in a NAT translation will always be process-switched" - so what's all this fast/process-switched business?

Cisco layer 3 devices have three switching modes, Process Switching, Fast Switching, and Cisco Express Forwarding switching.

If you see the configuration line "no ip route-cache" on an interface, packets entering that interface will be process switched, which means the CPU does the switching for every packet - a potential burden on a router.
Fast switching overcame this by caching the result of the first (process-switched) packet, so that subsequent packets in the same flow didn't have to hit the CPU's full forwarding path - they're switched at interrupt level using the route cache.
CEF (Cisco Express Forwarding) builds on fast switching by using a FIB (Forwarding Information Base) and adjacency tables, built in advance from the routing table, to quickly link packets/flows with routing entries and adjacent devices - no first packet is needed to populate a cache.

This is one way of determining what each interface is doing - have a look at the interface configuration for these commands:

*Nothing specified on the interface?*
CEF - usually enabled by default; you may see ip cef listed earlier in your config.
*no ip route-cache?*
Process switched.
*ip route-cache?*
Fast switched.
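If you'd rather check than eyeball the config, show ip interface reports the switching path per interface - on the IOS versions I've seen, it prints lines along the lines of "IP fast switching is enabled" and "IP CEF switching is enabled" (the interface name below is just an example):

show ip interface FastEthernet0/0 | include switching

Filtering on "switching" saves wading through the rest of the (lengthy) output.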

That seems to make a bit more sense.

Monday 16 January 2012

Only one IP access-group allowed

A network manager needs to allow some inbound traffic from our version of the internet.  To complete this, we need to review our ACL on a border router.  The device uses CBAC (or IP Inspect, as it's known in the CLI) to allow only internal traffic back through the router - traffic must be generated internally first to be allowed back through by the stateful inspection list.  Usually these routers deny everything coming in on the internet-facing interface, because there are some nasty folk out there.

Whilst it seemed natural to create an ACL for the required ranges for the network manager, after I'd done the hard work of configuring the wildcard masks (they are odd, aren't they?) I then 'forgot' that you can only have one IP access-group per direction (inbound or outbound) on an interface.  I say forgot, but I'd actually forgotten there was already one applied to the interface (which denied everything inbound).  When I came to type ip access-group... I thought I'd check whether one was already there.

And to summarise, yes, there was one.  And yes, the ACL I had created was a waste.  Instead I had to append the IP ranges and wildcards to the already configured extended access-list - taking care to ensure the deny statement stayed at the end of the list.
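For anyone facing the same job, a named extended ACL with sequence numbers makes the appending less painful, because you can slot new permits in above the final deny.  Everything here - the name, addresses, ports and sequence numbers - is made up for illustration:

ip access-list extended INTERNET-IN
 18 permit tcp 203.0.113.0 0.0.0.255 any eq 443
 19 permit tcp 203.0.113.0 0.0.0.255 any eq 25
! the existing deny at the end of the list stays as the last entry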

So, only one IP access-group is allowed per direction on an interface - which in turn may affect how you set up your ACLs.

Authentication with EIGRP

I found that I learned nothing at the weekend!  Although we did celebrate 10 years of 'courting' (as my Nan would say) by going for a meal - pigeon is not a very nice meat, but I guess I already knew that, so it's not really anything to blog about...

Back to the text books today, and thus I have learnt that EIGRP only supports MD5 authentication - but only after getting the question wrong on a test exam :(
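For my own notes, MD5 authentication on EIGRP hangs together like this - a key chain holds the secret, then the interface references it.  The AS number, key chain name, key string and interface here are all made up for illustration:

key chain EIGRP-KEYS
 key 1
  key-string s3cr3t
!
interface FastEthernet0/0
 ip authentication mode eigrp 100 md5
 ip authentication key-chain eigrp 100 EIGRP-KEYS

Both neighbours need matching key strings, or the adjacency won't form.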

Friday 13 January 2012

High CPU on Cisco 1841 Router - ARP or SNMP?

And so the learning continues today...

Another network manager has reported an issue with some legacy routers.  The CPU is over 80% and the SNMP engine seems to be partly responsible.  The manager has turned off the SNMP engine to prevent the router from melting - but unsure why the problem has occurred.

Another network administrator had already found some information that may help.  He discovered that the ARP tables on the 'busy' routers were large.  In fact they were extremely large for a remote site/branch router - 50,000 entries on one of them is a serious amount of ARP.

Delving further into the problem, it appears that the router was proxy ARPing everything.  Running show arp summary (available on the slightly newer IOS versions some of ours have) gives some indication as to which interfaces the ARPing was being conducted on.  In this case it was a VLAN interface (the router was using an EtherSwitch card) that connected the remote site to its main/hub site.  It was also the path for traffic that didn't have a specific route in the routing table - namely a default route (ip route 0.0.0.0 0.0.0.0 vlan128).  The ARP table showed requests for sites on the internet as well as internal ranges.

The Cisco documentation regarding troubleshooting high CPU utilization does cover this aspect as highlighted here:

************************************

ARP Input

High CPU utilization in the Address Resolution Protocol (ARP) Input process occurs if the router has to originate an excessive number of ARP requests. The router uses ARP for all hosts, not just those on the local subnet, and ARP requests are sent out as broadcasts, which causes more CPU utilization on every host in the network. ARP requests for the same IP address are rate-limited to one request every two seconds, so an excessive number of ARP requests would have to originate for different IP addresses. This can happen if an IP route has been configured pointing to a broadcast interface. A most obvious example is a default route such as:
ip route 0.0.0.0 0.0.0.0 Fastethernet0/0
In this case, the router generates an ARP request for each IP address that is not reachable through more specific routes, which practically means that the router generates an ARP request for almost every address on the Internet. For more information about configuring next hop address for static routing, see Specifying a Next Hop IP Address for Static Routes.
Alternatively, an excessive amount of ARP requests can be caused by a malicious traffic stream which scans through locally attached subnets. An indication of such a stream would be the presence of a very high number of incomplete ARP entries in the ARP table. Since incoming IP packets that would trigger ARP requests would have to be processed, troubleshooting this problem would essentially be the same as troubleshooting high CPU utilization in the IP Input process.

************************************

So it appears that specifying the next hop as an interface may not be good practice.  Luckily we already have some knowledge of this and have configured the IP addresses of remote router interfaces as next hops at most sites (it's usually wise to base the next hop on a reachable IP, in case you have more than one route to it, rather than on an interface state), but obviously some legacy configs are still around.

It must be that the SNMP engine was then processing all this ARP data, adding to the CPU load of a router already busy with ARP requests!

By changing the next hop from an interface to an IP address, the router calmed down and normal service was resumed - although there is still some monitoring to be done to see if this is a long-term cure.  One suspects that turning proxy ARP off is not only a good security lesson, but may help the router out in future.
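The fix itself is only a couple of lines - swap the interface next hop for the hub router's IP and, while you're there, turn proxy ARP off on the interface.  The next-hop address below is made up for illustration:

no ip route 0.0.0.0 0.0.0.0 Vlan128
ip route 0.0.0.0 0.0.0.0 10.128.0.1
!
interface Vlan128
 no ip proxy-arp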

SEP 12.1 RU1 - Pull or Push?

So... the first bit of information I learned today was at work. 

Another network administrator has informed me that the default client/server communication method in the RU1 update for Symantec Endpoint Protection version 12.1 (yes, it is a mouthful - and a bitter-tasting one, I'll explain one day) is pull mode rather than push.
Apparently push mode means client/server comms are constant, so that any policy changes can be instantly 'pushed' to clients.  Not ideal in large networks, so they've changed the default to pull, where changes are fetched after each heartbeat cycle.

Now, I didn't know that, until now.