Common ISP Mistakes
This page lists some common mistakes made by ISPs (or other network
administrators) which cause other people grief. Some of these
apply to backbone providers, some apply to non-backbone providers,
and some apply to enterprise/organization network administrators.
You may want to use this as a guide towards "best practice".
Some of the sections discuss things which you can do wrong and
some suggest things which you can do right. The two are often
not clearly separated; I trust people will be able to distinguish
the two. A little more consistency would be nice.
Advertising routes to 0.0.0.0/0 via BGP.
This causes all (or much of) the traffic on the internet to be routed
to you, denying service to you and everyone else. It is typically caused
by specifying a default route on a machine running BGP.
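A simple export sanity check can catch this class of mistake before an
announcement leaves your network. Here is a minimal sketch in Python;
the authorized prefix list and the function interface are illustrative
assumptions, not any particular router's API:

    import ipaddress

    # Prefixes this network is authorized to originate (assumed example values).
    AUTHORIZED = [ipaddress.ip_network("192.0.2.0/24"),
                  ipaddress.ip_network("198.51.100.0/24")]

    def safe_to_announce(prefix: str) -> bool:
        """Return True only for prefixes we are authorized to originate."""
        net = ipaddress.ip_network(prefix)
        if net == ipaddress.ip_network("0.0.0.0/0"):
            return False  # never announce a default route to peers
        return any(net.subnet_of(auth) for auth in AUTHORIZED)

    assert not safe_to_announce("0.0.0.0/0")
    assert safe_to_announce("192.0.2.0/25")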
Blocking ICMP
Many ISPs and enterprise networks have blocked some or all ICMP traffic
in an attempt to provide permanent or temporary protection against DoS
attacks. Blocking ICMP "too big" messages will break PMTU discovery.
If you must block ICMP "echo" and "echo reply" as a temporary stopgap
measure, block only those message types (and preferably from
as small a subset of the internet as possible). While you are blocking
these types, it is a mistake to claim that the network is "up"; the
network is operating in a degraded mode.
Breaking ping is a bad thing, only to be done in emergencies.
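If your filtering software lets you match on raw ICMP types, the
distinction looks something like the following sketch (the policy
function is illustrative, not a real firewall interface):

    # ICMP types (RFC 792): 0 = echo reply, 3 = destination unreachable,
    # 8 = echo request.  Type 3 code 4 is "fragmentation needed and DF set",
    # the message PMTU discovery depends on.

    def icmp_allowed(icmp_type: int, icmp_code: int, emergency: bool) -> bool:
        """Illustrative policy: never drop "too big"; drop echo only in emergencies."""
        if icmp_type == 3:          # destination unreachable, including code 4
            return True             # blocking this breaks PMTU discovery
        if icmp_type in (0, 8):     # echo reply / echo request ("ping")
            return not emergency    # temporary stopgap only
        return True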
Breaking PMTU discovery means that two hosts whose own MTUs are larger
than the smallest MTU of any hop on the path between them will not be
able to communicate. Packets are dropped consistently enough to prevent
communication, even though some small part of each flow (generally the
same part) frequently gets through. This is a pattern-sensitive
(packet-size-sensitive) packet loss, not a random one, so TCP
retransmission does not recover. For SMTP, the HELO, MAIL, RCPT dialog
will happen and then the connection will hang on the message DATA (unless
the message is very short), tying up the servers at both ends until they
time out. Normally all attempts to resend the message will also fail.
For HTTP, the browser will probably be able to do a "HEAD" (short
response) but a "GET" will fail. The symptom to the user is that
they are consistently unable to get through to certain web sites,
with the connection stalling at the same place each time. On very rare
occasions, I have seen a connection actually succeed to one of these
unreachable servers when it was sufficiently loaded that it transmitted
the data in smaller chunks.
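You can often confirm a PMTU black hole from the outside by probing with
DF-flagged pings of increasing size: small probes get replies while
large ones silently vanish. A rough sketch, assuming the Linux iputils
ping and its -M do (set DF) and -s (payload size) options:

    import subprocess

    def df_ping_ok(host: str, payload: int) -> bool:
        """Send one DF-flagged ping with the given payload size; True on a reply."""
        result = subprocess.run(
            ["ping", "-c", "1", "-w", "3", "-M", "do", "-s", str(payload), host],
            capture_output=True)
        return result.returncode == 0

    # 1472 bytes of payload + 28 bytes of headers = a full 1500-byte Ethernet frame.
    for size in (64, 576, 1200, 1472):
        print(size, "ok" if df_ping_ok("www.example.com", size) else "LOST")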
If this happens to you, a workaround is to set the maximum MTU to 576 on each
of your clients (to fix outbound connections) and servers (to fix inbound
ones). Setting the MTU on your router does not seem to work: it might
help in one direction of data flow, by sending ICMP "too big" messages to
your inside hosts at a threshold lower than that of the lost "too bigs",
but not in the other direction, in which both sets of "too bigs" get
dropped. This does put a higher packet load on the backbone (more and
smaller packets have to be routed). The cause is usually a router
somewhere that drops packets which are too large but have the DF (don't
fragment) bit set, without generating the required ICMP "too big" message
(there is some defective hardware out there) or, more likely, some
clueless network operator who filtered all ICMP traffic, usually as a
naive attempt to protect against real or anticipated DoS attacks.
This is made much worse by filtering software which does not allow
you to filter specific ICMP types and (unfragmented) packet sizes.
A much better way to handle most of the DoS attacks (except smurf)
is to force fragment reassembly on a router which is not susceptible
before forwarding the packets; this puts a significant load on the
router, so it is best done on a router/firewall close to the
systems being protected (which is also desirable because it
gives more protection).
Note that PMTU discovery can be broken in at least three different ways,
if you are suffering from it: failure to generate ICMP "too big",
filtering ICMP "too big", and configuring router interfaces with
private addresses on equipment that handles traffic which may come
from the public internet.
Marc Slemko has a page on
Broken PMTU Discovery.
Using defective hardware that drops oversized packets without
sending ICMP "too big" messages
This breaks PMTU discovery, among other things.
Some "repeaters" which bridge incompatible (from an MTU perspective)
media types (like fddi to/from ethernet) are guilty of this offense.
A repeater is generally a brain dead device which simply cannot
generate ICMP messages. These devices should be avoided. They
might be usable in certain carefully controlled situations (where
every single device on the larger MTU side can be, and is, programmed with
the smaller MTU.
Filtering Broadcast addresses incorrectly
Filtering on 0.0.0.255 mask 0.0.0.255 is irresponsible.
The lame argument that anyone who has a host in that range is
asking for trouble is specious; just because they may be adversely
affected by some clueless individual somewhere does not justify
your being clueless as well. Yes, personally, I would avoid putting
an (externally accessible) host at .255 because of the general clue
deficit.
Failure to filter traffic with forged IP source addresses
This is the single most important thing you can do to protect against
net abuse originating within your network.
On some Cisco routers, you may find the "ip verify unicast reverse-path"
command helpful. On almost any box which has IP filtering capability, you
can filter out packets arriving on an interface whose source addresses do
not match the IP address range assigned to that interface.
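The check itself is simple. A minimal sketch of the idea in Python
(interface names and prefixes are assumed example values):

    import ipaddress

    # Prefixes legitimately reachable behind each customer-facing interface
    # (assumed example values).
    INTERFACE_PREFIXES = {
        "eth1": [ipaddress.ip_network("192.0.2.0/24")],
        "eth2": [ipaddress.ip_network("198.51.100.0/25")],
    }

    def source_ok(interface: str, src: str) -> bool:
        """Drop packets whose source address does not belong on that interface."""
        addr = ipaddress.ip_address(src)
        return any(addr in net for net in INTERFACE_PREFIXES.get(interface, []))

    assert source_ok("eth1", "192.0.2.77")
    assert not source_ok("eth1", "10.1.2.3")   # forged/leaked source: drop it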
Blocking traceroute
Traceroute typically uses UDP ports 33435 to 33524 for the first 30 hops
(for additional hops beyond that, add 3 ports per hop). You need to
allow these through firewalls or packet filters. Do not allow any
vulnerable servers to use this port range inside your net.
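If you need to derive the range for a different hop count, the arithmetic
is as follows; this sketch assumes the classic traceroute defaults (base
port 33434, three probes per hop), and implementations do vary:

    BASE_PORT = 33434   # classic traceroute default; implementations vary
    PROBES_PER_HOP = 3

    def port_range(max_hops: int):
        """UDP destination ports used for a traceroute of max_hops hops."""
        return BASE_PORT + 1, BASE_PORT + PROBES_PER_HOP * max_hops

    print(port_range(30))   # (33435, 33524), matching the range above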
Don't allow your downstream networks to be used as smurf amplifiers
On Cisco routers, you can use "no ip directed-broadcast". On almost any box
with IP filtering, you can deny any packets which originate
on an outside interface and have an address matching your inside
networks' broadcast addresses. You should do a smurf scan
on your downstreams and block traffic to those broadcast addresses
which are found to result in smurf amplification (if a ping to
that address results in multiple replies ("DUP!"), you should block
traffic to that specific address until your downstream can fix it themselves).
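A crude smurf scan can be scripted around ordinary ping by counting how
many replies a single broadcast ping draws. A rough sketch, assuming the
Linux iputils ping (its -b flag permits pinging a broadcast address) and
assumed example target addresses:

    import subprocess

    def amplification(broadcast_addr: str) -> int:
        """Ping a broadcast address once and count the replies that come back."""
        result = subprocess.run(
            ["ping", "-c", "1", "-w", "3", "-b", broadcast_addr],
            capture_output=True, text=True)
        return result.stdout.count("bytes from")   # >1 means amplification

    for addr in ["192.0.2.255", "198.51.100.127"]:   # assumed example targets
        n = amplification(addr)
        if n > 1:
            print(addr, "amplifies x%d - block it" % n)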
Configuring router interfaces with Private IP Addresses
This breaks things because many networks legitimately prevent
any packets with a source address in one of the private blocks
from entering or crossing their networks. Any ICMP responses
from the router (such as "TTL exceeded" or "too big") will
carry the illegal private address. This can break traceroute,
PMTU discovery, and more.
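One quick way to spot this from the outside is to look for private
addresses among the hops a traceroute reports; a small sketch (the hop
list is assumed to have been parsed from traceroute output):

    import ipaddress

    def private_hops(hops):
        """Return the hops that answered from private (RFC 1918 etc.) addresses."""
        return [h for h in hops if ipaddress.ip_address(h).is_private]

    # Example: the second hop here would break PMTU discovery and traceroute
    # for networks that (legitimately) filter private source addresses.
    print(private_hops(["203.0.113.1", "10.0.0.1", "198.51.100.7"]))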
Reassembling packet fragments
There are two reasons one might want to do this: improving performance
in certain (unusual) situations (while putting a huge load on the routers)
and security.
RFC-1812 specifically forbids a router from doing this because
the fragments may take different routes and therefore not all be
available for reassembly, causing packet loss which may be
pattern-sensitive rather than random and therefore not fixed
by retransmission. However, RFC-1812 was clearly written with
only the dubious performance reasons for reassembly in mind.
The security reasons are far more compelling, and I think RFC-1812
will need to be rewritten in the future to permit reassembly
in certain limited situations, namely: A) where the router performing
reassembly is on the only possible route into or out of the
network being protected, or B) where steps have been
taken to force all fragments of any given packet to converge at a
common point for reassembly (be careful that you don't inadvertently
cause fragmented packets to appear to have originated from inside
your protected network in the process).
Failure to read RFC-1812
RFC-1812 is the current Router Requirements RFC.
Failure to document any deviations from RFCs
If you absolutely must deviate from any RFCs or from best practice,
this should be clearly documented on your web site in an
easily found location.
Failure to properly delegate reverse address mappings
Some ISPs do not know how to properly delegate reverse address
mappings to a downstream customer's domain name server for network
blocks smaller than /24. The O'Reilly and Associates DNS book
devotes several pages to how to do this. This failure interferes with
the downstream network's administration of its address space.
Failure to prevent SPAM originating within or relaying through your net
Disable forwarding of messages through your SMTP servers unless they come
from, or are destined to, your network or downstream networks. Spammers
misuse other people's SMTP servers to handle delivery for them, allowing
them to send large amounts of spam from even a slow link and making
the spam harder to trace and block.
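You can test your own servers the way spammers probe them, by checking
whether an outside-to-outside delivery is accepted. A minimal sketch
using Python's smtplib (the host names are hypothetical; only probe
servers you operate):

    import smtplib

    def is_open_relay(server: str) -> bool:
        """Try an outside-to-outside RCPT; acceptance suggests an open relay."""
        try:
            smtp = smtplib.SMTP(server, 25, timeout=10)
            smtp.helo("relay-test.example.org")
            smtp.mail("probe@outside-a.example.org")
            code, _ = smtp.rcpt("victim@outside-b.example.org")
            smtp.quit()
            return 200 <= code < 300   # relay accepted a foreign recipient
        except (smtplib.SMTPException, OSError):
            return False

    print(is_open_relay("mail.your.net"))   # hypothetical server name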
Enforce your acceptable use policy.
Failure to have, and enforce, an Acceptable Use Policy
Make sure your Acceptable Use Policy gives you the power to terminate
accounts which engage in net abuse. Some spam prevention
web sites have links to examples of AUP which are used
by other ISPs to prevent abuse.
Require proper identification from all prospective network
customers (both dedicated circuit and dialup). Require
all prospective customers to sign a statement that they will
abide by the AUP and that they have never had access
terminated before for any type of network abuse.
At a minimum, run all credit cards through with name/address verification
before granting access. Check all credit cards, names, addresses, and
phone numbers against known abusers. Verify that the prospective
customer can actually be reached at what they claim is their phone
number. Publish the name, address,
phone number, and the MD5 hash of the credit card number used by
any spammer to the net-abuse newsgroup.
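Publishing a hash rather than the number lets other providers match a
known abuser's card without your revealing it. A small sketch; note that
you must canonicalize the number first or hashes from different sites
will not match:

    import hashlib

    def card_fingerprint(card_number: str) -> str:
        """MD5 of the digits only, so spaces and dashes don't change the hash."""
        digits = "".join(ch for ch in card_number if ch.isdigit())
        return hashlib.md5(digits.encode("ascii")).hexdigest()

    # The same card written two ways yields one publishable fingerprint.
    assert card_fingerprint("4000 1234 5678 9010") == card_fingerprint("4000-1234-5678-9010")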
Although AOL does enforce its AUP after the fact (at least by terminating
accounts, if not taking perpetrators to court), it is their
promiscuity in signing up customers that leads to so much spam
from their network.
If you do not want the hassle of suing perpetrators
to recover damages, at least assign your rights to collect damages
to a law firm which will sue on its own behalf. Write your
AUP to make lawsuits lucrative by establishing minimum damages.
Your AUP should also allow you to monitor suspected net-abuse.
If someone sends 1000 copies of an email message, you should be
able to read it to see if it is SPAM (if they are sending
1000 copies, it probably isn't that private anyway).
Your AUP should differentiate between unauthorized security attacks
and authorized ones carried out by reputable security consultants
or by the target network's owner.
Transparently proxying traffic without consent
It is completely unethical, and probably illegal as well, to transparently
proxy traffic that transits your network without the consent of
at least one of the two parties to that traffic, except to provide
informative rejection messages in the case of traffic blocked because
of net-abuse. Digex was,
until recently, considered a reputable backbone provider but in June
of 1998 it became public knowledge that they were transparently
caching web traffic; now their name is mud.
Internet packets are not fungible; you can't treat them like lumps of coal.
Transparent proxying or caching can legitimately be used to:
- Serve as a web accelerator for downstream customers who
have explicitly requested it.
- Improve performance for downstream customers, provided
you have notified them of the transparent caching and
provided a means (such as a web form) by which they
can turn it on or off at will for their accounts.
- Provide an informative error message for
connections to which you would otherwise deny service silently.
- Force caching of your downstream customers' outgoing web
requests with no option to disable it, but only if you are
explicitly selling substandard service.
Anyone using transparent caching should also beware that it only works
correctly if all TCP packets for a connection travel through one point;
otherwise, you have the same kinds of problems as with fragment reassembly.
Making changes without notifying affected parties
It is your responsibility to notify your downstream contacts if
you do any of the following:
- Filter traffic to/from their network in any way.
- Deviate from any RFC on any of your routers, particularly RFC-1812.
- Change the point of attachment for their connection.
- Change any of the above
Failure to adequately monitor for net abuse
- SPAM
Any host or account on your net which sends excessive mail
(total number of recipients for all messages) during a
given period of time should trigger an alarm. In response
to that alarm, you can investigate whether it is spam or a legitimate
mailing list (see the sketch after this list).
- Ping of Death and other similar DoS
- Smurf
Monitor for inbound attacks, outbound attacks, and attacks
which are using your network as an amplifier.
- Any forged source address packets
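As an example of the alarm described in the first item above, here is a
sketch that tallies recipients per sender; the simplified log format is
an assumption, as real MTA logs differ:

    from collections import Counter

    THRESHOLD = 500   # recipients per period before a human looks at it (tunable)

    def spam_alarms(log_lines):
        """Tally total recipients per sender; flag senders over the threshold."""
        totals = Counter()
        for line in log_lines:
            # Assumed simplified format: "<sender> <number-of-recipients>"
            sender, nrcpt = line.split()
            totals[sender] += int(nrcpt)
        return [s for s, n in totals.items() if n > THRESHOLD]

    log = ["user1@your.net 3", "bulk@your.net 400", "bulk@your.net 200"]
    print(spam_alarms(log))   # ['bulk@your.net'] - investigate, don't assume spam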
Failure to document any deviations from RFCs
If you deviate from any RFCs or filter traffic in any way, you
should document that on an easily located page on your web site
dedicated to troubleshooting network problems. This page should
have links to/from your downtime page, if they are not the same,
and should be linked directly from your organization's main web page.
Failure to consider who is affected by your countermeasures
Countermeasures to various types of attacks should be designed
to minimize the effect on innocent people and maximize the
effect on the perpetrators, or on those who have helped the perpetrators
through contributory negligence. If you are smurfed, blocking all
traffic from the "smurf amplifier" sites is better than blocking
a whole type of traffic (such as ICMP) from the entire net.
If you have to block some traffic, it is also a good idea to consider
redirecting HTTP (and possibly other common protocols like SMTP and FTP)
to protocol-specific servers which return suitable error messages
referring people to "http://accessdenied.your.net/". This minimizes
the time wasted by people who may not be at fault, cuts down on
calls to your tech support lines, and directs hostility towards
those who are at least partially responsible.
Allow all HTTP traffic to the minimalist web server "accessdenied.your.net",
which does not necessarily need to be inside your own network;
even "/bin/cat /path/accessdenied" (where the file contains HTTP headers
plus HTML text) might be sufficient if it is invoked from an inetd
substitute with process fork limits. Similar programs may suffice for FTP
and SMTP error servers. Linux boxes (and probably *BSD boxes
as well) can redirect blocked traffic to a specified port on the
firewall, where you can run a protocol-specific error server. Here is a
sample error message file, assuming 554 is a suitable error number for
the protocol and that the protocol uses SMTP/FTP-style messages (HTTP and
IMAP need a little change):
554- Access has been blocked at your.net to/from certain networks
554- because of net abuse or security violations originating at
554- or aided by the blocked networks.
554- For further information, including where to direct your complaints,
554 visit http://accessdenied.your.net
Any decent client will display these error messages to the user.
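A minimal error server along these lines takes only a few lines. This
sketch speaks the SMTP/FTP multiline style shown above and assumes your
firewall redirects blocked connections to its port:

    import socketserver

    BANNER = (b"554- Access has been blocked at your.net to/from certain networks\r\n"
              b"554- because of net abuse or security violations originating at\r\n"
              b"554- or aided by the blocked networks.\r\n"
              b"554- For further information, including where to direct your complaints,\r\n"
              b"554 visit http://accessdenied.your.net\r\n")

    class ErrorServer(socketserver.StreamRequestHandler):
        def handle(self):
            # Emit the rejection banner and hang up; no input is read or trusted.
            self.wfile.write(BANNER)

    if __name__ == "__main__":
        # Port 2554 is an arbitrary example; point your firewall redirect at it.
        socketserver.ThreadingTCPServer(("", 2554), ErrorServer).serve_forever()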
Failure to adequately control your upstream links
You should have an upstream provider with a responsive 24-hour NOC
which can and will reprogram the routers for your link as needed.
Alternatively, you can install your own router at your upstream
provider's end of your upstream links. In some cases, this
will require that you have a direct router-to-router connection
so your upstream can protect against any shenanigans (promiscuous
mode, masquerading, etc.) from the router under your control.
These measures allow you to (possibly automatically) stop hostile
traffic (such as smurf echo replies) before it clogs your upstream
link.
Another possibility is to get your upstream provider to
implement a secure HTTP-based server which
allows you, and only you, to program certain characteristics,
and only those characteristics, of your link, such as filter rules
(except those which prevent you from sending or forwarding traffic
with spoofed source addresses).
Check suspected abuse against known good guys
You should have a list of downstream connections or accounts which are
used by reputable security consultants. Your policy should require
your NOC to check against this list before disabling an account or line
because of abuse alarms. In such cases, you should check whether
the attack is authorized before taking any action. It also makes
sense to compare the name and address on the credit card used on
a dialup account with the known administrators of the target network
(particularly those listed in whois).
Block spammers' web sites
Redirect requests for the web sites of companies which advertise via spam
to a page that indicates that access has been denied for unethical conduct.
Explain on that page that if they are trying to get past the block
(for example, to collect info for retaliation) they can go through the
anonymizer.
Try to verify that the owners of the web site were the ones who
really sent the spam (i.e., that they weren't framed). Posing as an
interested customer is usually helpful in this case.
Subscribe to Vixie's validated spam-blocking BGP feed (the MAPS RBL)
Don't shoot the messenger
When someone points out a security problem in your network or systems,
do not blame or take action against the messenger. They are friend, not foe.
When the famous physicist Richard Feynman went around breaking into safes used
in the Manhattan Project and then reported his successes to security,
the morons handling security sent out memos telling people to change their
combinations if Feynman had been in their offices recently, rather
than telling them not to use their birthdays, or other stupid choices,
as their safe combinations. Don't fix the blame; fix the problem.
Security isn't important here
Many organizations neglect security on the grounds that it isn't
important for their organization. This is wrong. Academic
organizations often suffer from this attitude.
- Your site, once compromised, may be used to attack other
sites. You may even be liable if your site is used to harm others
because of your negligence. Your connectivity to all or parts of
the internet may be terminated if you cannot control access
and stop abuse.
- How would your organization be affected if all the computers
were unavailable or malfunctioned simultaneously? Particularly
at a crucial time? How would you feel if you couldn't pay your
rent because someone crashed all the machines in payroll? If you
couldn't meet your own deadlines because someone crashed your
machines?
- How many people use secure services from within your net?
How many people order books online with a credit card, for example,
from your campus computers?
- You may have much more confidential information than you realize.
How would your customers react if their credit card numbers were
stolen from your online store? How would your students react
if a list of the self-help books they checked out
from the campus library were made public?
- Do visitors to your campus log in to their home systems from
your network?
Failure to engineer networks with a sniffer-resistant topology
Use switches, bridges, or routers instead of hubs, at least in strategic
locations but preferably everywhere. Use good devices which
can have learning disabled and be hard-configured, for more protection.
No host in your web farm, for example, should be able to eavesdrop
on traffic to/from any other host (particularly those owned, operated,
or controlled by a different organization).
Consider setting up a trace host
Particularly if your NOC is not responsive 24x7,
you may want to operate a tracing host which allows authorized
downstream providers to trace traffic which is addressed to
their own networks only. Tcpdump may be used for this purpose
with restrictions imposed on the filters used (or pipe one
tcpdump into another with user-specified parameters; no shell escapes,
-w, or buffer overflows should be permitted). Some skill
is required to do this safely.
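The dangerous part is letting users supply filter text. Constrain them
to choosing a destination network you have verified is theirs, and build
the tcpdump argument list yourself with no shell involved. A rough
sketch (the authorization table is an assumed example):

    import ipaddress
    import subprocess

    # Networks each authorized downstream may trace (assumed example values).
    AUTHORIZED = {"customer-a": [ipaddress.ip_network("192.0.2.0/24")]}

    def trace(user: str, dest_net: str):
        """Run tcpdump restricted to traffic addressed to the user's own network."""
        net = ipaddress.ip_network(dest_net)          # raises on malformed input
        if net not in AUTHORIZED.get(user, []):
            raise PermissionError("not your network")
        # Fixed argument list, no shell, no -w: output is text on stdout only.
        # (Needs capture privileges on the tracing host.)
        subprocess.run(["tcpdump", "-n", "-c", "100", "dst", "net", str(net)])

    trace("customer-a", "192.0.2.0/24")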
Or, consider uploading a report from your abuse monitors to
your web server every 5 minutes. This report should include
only clear abuses, to protect privacy.
24x7 NOC or staff on call
Networks of any significant size need a 24x7 NOC, or
at least knowledgeable people with portable/home computers
on call 24 hours a day. If you don't have a 24x7 NOC, you should
consider contracting an outside firm to provide one.
A 24x7 answering service is not a 24x7 NOC.
Rapid response is important for tracking down and stopping many types
of net abuse.
Educate your network customers
Make information available to your downstream networks' administrators,
who may be less experienced. You might want to link to this page.
Make sure your dialup customers understand they must not do business
with spammers. Make sure they understand that spammers often
try to make untargeted unsolicited email look like targeted
mail; just because you actually are interested in the products being offered
doesn't mean they didn't send the advertisement to 65 million other
people who weren't.
Cooperation
Cooperation with other network providers is important in preventing
abuse and diagnosing problems.
NOCs need to be responsive to queries from other NOCs.
The preferred solutions to many problems require a significant
development effort. Network providers can cooperate by sharing
internally developed tools with others or by organizing
with others to fund development of tools which meet a common need.
Tapping the network infrastructure fund might be appropriate
for some of these tools.
Net abuse is one of the areas where the old maxim "If you aren't part of the
solution then you are part of the problem" applies.
Failure to support required or recommended email addresses
abuse@yourdomain.net, postmaster@yourdomain.net, usenet@yourdomain.net,
domain@yourdomain.net, noc@yourdomain.net, support@yourdomain.net,
etc. See
RFC2142.txt.
Web host screwups
- Failure to have enough disk space available. You should have enough
disk space to satisfy the amount you promised every customer. If you
do oversell your disk space, it is your obligation to monitor disk
usage and add drives (or move accounts) long before the disks fill up.
- Make sure you put "UserDir disabled" inside each "<VirtualHost>" section
in httpd.conf if you have defined "UserDir" in the main section.
Otherwise, you will screw up other people's namespaces and also create
a security hole.
Failure to outsource or hire qualified people
Outsource those services which your network is not big enough to
handle, or which your staff is not qualified for or too overworked to
deal with.
- Engineering consultants.
- Security.
- NOC, at least for 24-hour service.
An answering service doesn't cut it.
- News server administration or news service.
Another thing to outsource is roaming service, or join a cooperative
roaming exchange network like iPass.
Three very common reasons someone might abandon their local ISP
Ok. This particular section is, theoretically, a bit self-serving
(because you might outsource some work to me). But failure to
outsource where appropriate
is one of the primary reasons that most small networks are likely
to go the way of the dodo bird, driven out of business by
large vertical monopolies with better economies of scale.
This is something I don't want to see happen.
You do, of course, need to use secure communication channels when
you allow contract personnel to remotely administer some of
your systems.