[soen321-f04] more more more stuff (Sunday)

David K. Probst PROBST at vax2.concordia.ca
Sun Nov 14 10:37:21 EST 2004

SOEN 321 Week "8" Lecture Material
_________________________________

Intrusion detection
___________________

- what it is good for
- how you can do it (anomaly detection, misuse detection)
- how it can be compromised (at least, automatic forms of it)

Principles
__________

1. Actions of users and processes are statistically predictable.
2. Actions of users and processes do not include sequences of
   commands that subvert security.
3. Actions of processes lie inside a set of actions permitted by
   the security policy.

Violation of any of the above is an indicator of an attack.

Example: An attacker wants to install a backdoor. He enters as an
ordinary user and then becomes root. This is unusual (principle 1).
While becoming root, the attacker uses "evil" sequences of commands
(principle 2). Moreover, any changes to system files may cause them
to behave in ways they are not supposed to (principle 3). Also,
modifying a user file may allow processes executing on behalf of
that user to, say, connect to sites they were unable to connect to
before, or to execute commands they could not or did not execute
before (principle 1).

Note: Intrusions can be arbitrarily sophisticated.

Definition: An _attack tool_ is an automated tool designed to
violate a security policy.

Example: 'rootkit' is an attack tool; among other things, it sniffs
passwords. It also installs fake versions of system programs.
These include fake:

- netstat: conceals certain network connections
- ps: conceals certain processes (e.g., the sniffer)
- ls: conceals certain files (as does 'du')
- ifconfig: conceals promiscuous mode
- login: accepts "magic" password

The cryptographic checksums of the fake programs are perfect.

We see that most of the obvious traces of an attack have been
hidden. But there are still certain integrity checks:

- used blocks plus free blocks = all blocks
- 'ls' should tell the truth about file counts
- load average should reflect number of running processes

We can assess these quantities with programs _other_ than the fake
ones and do the analysis. 'Rootkit' did not corrupt the kernel or
the file structure, so other programs will continue to report
correct information; only the fake programs lie.

Automation
__________

Can we automate the intrusion-detection process?

We want to automate:

- anomaly detection
- misuse detection
- specification-based detection

Goals of an IDS
_______________

1. Detect a wide variety of intrusions (including from insiders).
2. Detect intrusions in a timely fashion (both in real time and in
   non-real time).
3. Digest/abstract. Present summary to expert human.
4. Accuracy (false positives, false negatives).

Generally,

- anomaly models are statistically based
- misuse models are signature based
- specification models are specification based

Anomaly detection requires adaptive statistical profiling. There is
an infinite range of possibilities if only because there is an
infinite range of variables.

Suppose I'm a mole and I want to read sensitive files without
attracting attention. My daily average is to peruse 5 files a day.
So, I plan ahead, opening 6, then 7, then 8, ... until my profile
will not be disturbed as I forage widely in files outside my
"need-to-know" range.

Urban legend: Nervous people are shaky (they emit "tells") when
they use touchtone phones. Back-of-the-envelope absurdity, if
untargeted.
But there are biometrics, typing rhythms, pseudopolygraphs.

Formally, this requires either pure math (some form of statistics)
or AI (machine learning of predictive models). Note that data mining
is a generalization of statistics, so many, many things are possible
in principle.

Misuse modeling
_______________

Signature analysis actually generalizes to rule-based detection,
but this requires a knowledge of system vulnerabilities. This gets
us into different flavors of AI. One stab at a distinction: Does the
sequence of data match any of the rules (of bad stuff), as opposed
to: is it kind of unusual? There is some brittleness here because
you first have to have a good set of rules.

If anomaly detection is the art of looking for unusual states, then
misuse detection is the art of looking for states known to be bad.

Agent
_____

Who gathers the information? Obviously, it needs to be filtered
(downselected) before it can be analyzed.

- host-based information gathering: system and application logs
- network-based information gathering: monitor network traffic,
  detect network-oriented attacks, use network sniffing

The major subtlety is that the analysis software must ensure that
the view of the network traffic is identical across the set
{analyzer, host1, host2, ...}.

Notifier
________

Automatic or incident response.

Automatic:

     __________
    |          | <--- fw1 <---
    |          |
    |          | <--- fw2 <---   Internet
    |          |
    |          | <--- fw3 <---
     ----------

If the IDSs on fw1 and fw2 detect a coordinated attack, they can
instruct fw3 to reject packets from the source(s) of the attacks.

Let's look at an example of an IDS for detecting network intruders
in real time.
- passively monitor the network link over which an intruder's
  traffic passes
- filter the network-traffic stream into a sequence of higher-level
  events
- feed events into an analysis engine that checks conformity with
  the security policy

We will also

- look at a number of attacks that attempt to subvert passive
  monitoring systems

Design goals for a network monitor
__________________________________

- no packet-filter drops: the missing traffic might contain
  precisely the interesting traffic; after all, an attacker might
  attack the monitor itself
- real-time notification
- extensible to include knowledge of new types of attacks

Structure
_________

- for each application protocol that you understand, capture every
  packet "tcp port finger or tcp port ftp or ...", i.e., any TCP
  packet with a source or destination port of 79 (finger),
  21 (ftp), ...
- also capture TCP packets with SYN, FIN, or RST bits set (these
  indicate the beginning and the two kinds of end of a TCP
  connection)
- event engine records state for
  IP address x IP address x port x port by means of a finite-state
  machine

As a trivial example, it might look for the following. Suppose the
input received by an FTP server is

    USER nice \0 USER root

Is this two separate commands or one? Will the monitor and the
receiving host see precisely the same thing?

Attacks on the monitor
______________________

- overload, crash, subterfuge

1. Overload: Drive the monitor to the point of overload, then
   attempt a network intrusion. Defense: Be secretive about how
   much load the monitor can handle.

2. Crash attacks: Kill the monitor (failure, resource exhaustion),
   then attack as before. Defense: not particularly interesting.

3. Subterfuge attacks: The _key idea_ is to find a traffic pattern
   that is interpreted in one way by the monitor and in an entirely
   different way by the receiving endpoint (target host). The
   embedded NUL and fragmented IP datagrams that are re-assembled
   in different ways are trivial examples.
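
The embedded-NUL case can be made concrete with a small sketch. It
does not model any real FTP server or monitor: the two functions
below (view_truncating_at_nul, view_nul_as_separator) are
hypothetical parsers, and the assignment of one to the monitor and
the other to the endpoint is an assumption made only to show how the
same bytes can produce two different command lists.

```python
# Hypothetical sketch: two parsers reading the same FTP control bytes.
# Neither function models a real FTP server or a real IDS.

RAW = b"USER nice\x00USER root\r\n"  # the input from the example above


def view_truncating_at_nul(data: bytes) -> list:
    """Treat each CRLF-terminated line as a C string: stop at the first NUL."""
    commands = []
    for line in data.split(b"\r\n"):
        line = line.split(b"\x00", 1)[0]
        if line:
            commands.append(line.decode("ascii"))
    return commands


def view_nul_as_separator(data: bytes) -> list:
    """Treat NUL like a line break, so the text after it is a new command."""
    commands = []
    for line in data.replace(b"\x00", b"\r\n").split(b"\r\n"):
        if line:
            commands.append(line.decode("ascii"))
    return commands


# Assumption for illustration: the monitor truncates, the endpoint splits.
monitor_view = view_truncating_at_nul(RAW)
endpoint_view = view_nul_as_separator(RAW)

print("monitor sees: ", monitor_view)
print("endpoint sees:", endpoint_view)
print("views differ: ", monitor_view != endpoint_view)
```

Whichever parser each side really uses, the point is the same: any
disagreement between the two views is something an attacker can
exploit.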
Example: What if the monitor sees a particular packet that the
endpoint does not? How could this be accomplished by the attacker?
Launch a packet with an IP "Time to Live" (TTL) field just
sufficient to get it past the monitor but not sufficient to reach
the endpoint. Or, suppose the path from the monitor to the endpoint
has a smaller "Maximum Transmission Unit" (MTU). Just send a packet
that is too big with the "Do not fragment" bit set.

By manipulating packets in this way, an attacker can send innocuous
text for the benefit of the monitor, such as "USER nice", and then
retransmit (using the same sequence numbers) attack text, such as
"USER root", this time allowing it to reach the endpoint. Since the
monitor discards packet duplicates, it won't be aware of the attack
traffic.

Defense: Require that retransmitted packets match exactly. But what
about natural errors?

Examples of ftp checks:

- log ftp requests directed to sensitive users
- for any file-manipulation request, check for access to sensitive
  files
- a guest should be denied access to user-configuration files, but
  users may access their own config files

Definition: _Signature analysis_ is when the IDS is programmed to
interpret a certain series of packets, or a certain piece of data
contained in a packet, as an attack.

Example: An IDS that watches web servers might be programmed to look
for the string "phf" as an indicator of a CGI-script attack.

Most signature analysis is simple pattern matching:
Find "phf" in "GET /cgi-bin/phf?"

Let's look at insertion and evasion attacks.

Insertion attack
________________

Here, an IDS accepts a packet that the end system rejects. Thus,
the IDS alone might see the "funny" packets. This defeats signature
analysis because it defeats the pattern matching. To illustrate,
consider:

    GET /cgi-bin/....p....h....f....?

where the dots stand for funny packets that only the monitor sees.

Evasion attack
______________

Here, the end system accepts a packet that the IDS rejects.
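
Both definitions (insertion and evasion) can be sketched with a toy
model. Everything here is an assumption made for illustration: the
Packet class, its two boolean flags, and the helper names are
hypothetical, and real packet acceptance depends on details such as
checksums, fragments, and sequence numbers rather than on a flag.

```python
from dataclasses import dataclass

SIGNATURE = "phf"  # the string the IDS is programmed to look for


@dataclass
class Packet:
    payload: str
    ids_accepts: bool = True   # hypothetical flag: the monitor keeps this packet
    host_accepts: bool = True  # hypothetical flag: the end system keeps this packet


def reassemble(packets, who):
    """Concatenate the payloads that the observer ('ids' or 'host') accepts."""
    if who == "ids":
        kept = [p for p in packets if p.ids_accepts]
    else:
        kept = [p for p in packets if p.host_accepts]
    return "".join(p.payload for p in kept)


def ids_alerts(packets):
    """Simple pattern matching on the stream as the IDS sees it."""
    return SIGNATURE in reassemble(packets, "ids")


# Insertion: junk packets that only the IDS accepts break up the signature.
insertion = [
    Packet("GET /cgi-bin/"),
    Packet("p"), Packet("xx", host_accepts=False),
    Packet("h"), Packet("yy", host_accepts=False),
    Packet("f?"),
]

# Evasion: a packet carrying part of the attack is rejected by the IDS only.
evasion = [
    Packet("GET /cgi-bin/p"),
    Packet("h", ids_accepts=False),
    Packet("f?"),
]

for name, stream in (("insertion", insertion), ("evasion", evasion)):
    print(name)
    print("  IDS sees: ", reassemble(stream, "ids"))
    print("  host sees:", reassemble(stream, "host"))
    print("  alert:    ", ids_alerts(stream))
```

In both streams the end system reassembles the full request while
the IDS never sees the three signature characters next to each
other, so the substring match does not fire.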
There are all kinds of reasons why one computer can be more or less
strict about accepting packets than another (checksums, fragments,
sequence numbers, ...). Since the packets are rejected by the
monitor, the attack is not seen.

Conceptual basis
________________

IDS either

- generates a model of a program's or system's response to known
  inputs, or
- requires the generation of a rule base

A monitoring IDS raises an alarm if it detects a deviation from the
model or if a rule fires.

But how can you machine-learn the model? We need some human help.

Can we detect when an application has been penetrated and is then
exploited to do harm to other parts of the system?

    app
     |       frame IDS problem as a sandbox problem
     |
     v       specify allowed sequences of system calls
    O S

Think of the (uncorrupted) application as a finite-state machine
(transition system) whose outputs are sequences of system calls. A
transition system can only generate _certain_ sequences of calls.
If you observe an impossible call sequence, it is likely that an
attacker has introduced malicious code into the application's thread
of control by means of a buffer overflow, a format-string attack, or
whatever.

We detect the insertion of malicious code by observing sequences of
system calls that normally could not have occurred.

We use human intelligence to build a very abstract model of the
application we wish to monitor.

We quickly detect "exploit code" running with the application's
privilege.

We first build a very _abstract_ model of the application's (normal)
_behavior_.

Most existing exploit scripts grab full root privilege and take
other distinctive actions, such as launching a shell under the
attacker's control. This is so blatant it does not require a
sophisticated IDS to detect it.
But this abstract-model approach is able to detect when some of the
backdoors (fake programs) of 'rootkit' are executed, which causes
the behavior to deviate from that specified by the original source
code and captured in the abstract model.
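
The idea of checking observed system calls against an abstract
finite-state model can be sketched as follows. The model is a
made-up one for a tiny file-serving application: its states, its
system-call names, and the function first_violation are all
hypothetical, and a real model would be derived by a human from the
application's source code.

```python
# Hypothetical abstract model: state -> {system call -> next state}.
# Anything not listed is a call sequence the application cannot generate.
MODEL = {
    "start":   {"open": "opened"},
    "opened":  {"read": "reading", "close": "start"},
    "reading": {"read": "reading", "write": "reading", "close": "start"},
}


def first_violation(trace, model=MODEL, state="start"):
    """Return the index of the first call the model cannot generate, else None."""
    for i, call in enumerate(trace):
        next_state = model.get(state, {}).get(call)
        if next_state is None:
            return i  # impossible call sequence: raise an alarm here
        state = next_state
    return None


normal = ["open", "read", "write", "read", "close", "open", "close"]
hijacked = ["open", "read", "execve", "write"]  # e.g. injected code starting a shell

print("normal trace, first violation:  ", first_violation(normal))
print("hijacked trace, first violation:", first_violation(hijacked))
```

The normal trace stays inside the model, while the hijacked trace is
flagged at the first call that the model has no transition for.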