The first 32 values (0 through 31) are codes for things like carriage return and line feed. The space character is the 33rd value, followed by punctuation, digits, uppercase characters and lowercase characters.
0 NUL
1 SOH
2 STX
3 ETX
4 EOT
5 ENQ
6 ACK
7 BEL
8 BS
9 TAB
10 LF
11 VT
12 FF
13 CR
14 SO
15 SI
16 DLE
17 DC1
18 DC2
19 DC3
20 DC4
21 NAK
22 SYN
23 ETB
24 CAN
25 EM
26 SUB
27 ESC
28 FS
29 GS
30 RS
31 US
32 (space)
33 !
34 "
35 #
36 $
37 %
38 &
39 '
40 (
41 )
42 *
43 +
44 ,
45 -
46 .
47 /
48 0
49 1
50 2
51 3
52 4
53 5
54 6
55 7
56 8
57 9
58 :
59 ;
60 <
61 =
62 >
63 ?
64 @
65 A
66 B
67 C
68 D
69 E
70 F
71 G
72 H
73 I
74 J
75 K
76 L
77 M
78 N
79 O
80 P
81 Q
82 R
83 S
84 T
85 U
86 V
87 W
88 X
89 Y
90 Z
91 [
92 \
93 ]
94 ^
95 _
96 `
97 a
98 b
99 c
100 d
101 e
102 f
103 g
104 h
105 i
106 j
107 k
108 l
109 m
110 n
111 o
112 p
113 q
114 r
115 s
116 t
117 u
118 v
119 w
120 x
121 y
122 z
123 {
124 |
125 }
126 ~
127 DEL
Name     Abbr.   Size
Kilo     K       2^10 = 1,024
Mega     M       2^20 = 1,048,576
Giga     G       2^30 = 1,073,741,824
Tera     T       2^40 = 1,099,511,627,776
Peta     P       2^50 = 1,125,899,906,842,624
Exa      E       2^60 = 1,152,921,504,606,846,976
Zetta    Z       2^70 = 1,180,591,620,717,411,303,424
Yotta    Y       2^80 = 1,208,925,819,614,629,174,706,176
Tuesday, April 17, 2012
Everything You Need To Know About TCP/IP
TCP/IP, or Transmission Control Protocol/Internet Protocol, is a stack or collection of various protocols. A
protocol is basically a set of commands or rules by which two computers within a local network or on the
Internet can exchange data, information and resources.
TCP/IP was developed around the time of the ARPAnet. It is also known as the Protocol Suite. It consists of
various protocols, but as TCP (Transmission Control Protocol) and IP (Internet Protocol) are the best
known of them, the entire family or suite is called the TCP/IP suite.
The TCP/IP suite is a stacked suite with various layers stacked on each other, each layer looking after one
aspect of the data transfer. Data is transferred from one layer to the next. The entire TCP/IP suite can be
broken down into the layers below-:
Layer Name                              Protocol
Application Layer (The Visible Layer)   The actual running applications like-: FTP client, Browser
Transport Layer                         UDP, TCP
Network Layer (The Invisible Layer)     IP, ICMP
Link Layer (Hardware, Ethernet)         ARP, RARP, PPP, Ether
Physical Layer (Not part of TCP/IP)     Physical data cables, telephone wires
At the source, data travels from the Application Layer down through the Transport, Network and Link Layers
to the Physical Layer, and at the destination it travels from the Physical Layer back up to the Application
Layer. We will later discuss what each layer and each protocol does.
The TCP/IP suite not only helps to transfer data but also has to correct various problems that might occur
during the data transfer. There are basically two common types of error that might occur during the
process of data transfer. They are-:
Data Corruption -: In this kind of error, the data reaches the destination after getting corrupted.
Data Loss -: In this kind of error, not all of the packets which constitute the data to be transferred
reach the destination.
TCP/IP expects such errors to take place and has certain features which detect and correct the errors that
might occur.
Checksums-: A checksum is a value (normally a 16 bit value) that is formed by summing up the binary
data of a given data block. The sending program is responsible for the calculation
of the checksum value and sends this calculated value, along
with the data packets, to the destination. When the program running at the destination receives the data
packets, it re-calculates the checksum value. If the value calculated by the destination program
matches the checksum value attached to the data packets by the source program, then the data
transfer is said to be valid and error free. The checksum is calculated by adding up all the octets in a datagram.
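The checksum idea can be sketched in a few lines of Python. This is only an illustration: the function names and the sample message are made up here, and it follows the usual Internet checksum convention of adding the data as 16 bit words and folding any carry back in, which is slightly more specific than the plain "adding up all the octets" described above.

```python
def internet_checksum(data: bytes) -> int:
    """Return the 16-bit one's-complement checksum of data."""
    if len(data) % 2:
        data += b"\x00"  # pad odd-length input with a zero octet
    total = 0
    for i in range(0, len(data), 2):
        total += (data[i] << 8) | data[i + 1]  # add the next 16-bit word
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)  # fold carries back in
    return ~total & 0xFFFF


def arrived_intact(data: bytes, checksum: int) -> bool:
    """Destination side: recompute and compare with the value the source sent."""
    return internet_checksum(data) == checksum


message = b"hello, destination"
sent_checksum = internet_checksum(message)   # calculated by the source
corrupted = b"hellp, destination"            # one octet damaged in transit
print(arrived_intact(message, sent_checksum))    # data matches the checksum
print(arrived_intact(corrupted, sent_checksum))  # damage is detected
```

The destination never needs the original data to detect damage; it only needs the received octets and the checksum that travelled with them.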
Packet Sequencing-: All data being transferred on the net is broken down into packets at the source and
joined together at the destination. The data is broken down into packets in a particular sequence at the
source. This means that, for example, the first packet has the first sequence number, the second packet the
second sequence number and so on. These packets are free to travel independently on the net, so
sometimes they arrive at the destination out of sequence, which means that the
packet which had the first sequence number attached to it does not reach the destination first. Sequencing
defines the order in which the data packets or messages have to be put back together. The application or the layer
running at the destination automatically rebuilds the data from the sequence number in each packet.
The source system breaks the data to be transferred into smaller packets and assigns each packet a unique
sequence number. When the destination gets the packets, it starts rearranging them by reading the
sequence number of each packet to make the received data usable.
For example, say you want to transfer an 18000 octet file. Not all networks can handle an entire 18000
octet packet at a time. So the huge file is broken down into smaller packets of, say, 300 octets each. Each packet is
assigned a unique sequence number. When the packets reach the destination they are put
back together to get the usable data. During the transportation process, as the packets can move
independently on the net, it is possible that packet 5 will arrive at the destination before packet 4
arrives. In such a situation, the sequence numbers are used by the destination to rearrange the data packets
in such a way that even if packet 5 arrived earlier, packet 4 will always precede packet 5.
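The 18000 octet example can be tried out with a short Python sketch. The helper names are invented for illustration, and the shuffle merely stands in for packets taking different routes across the net.

```python
import random


def split_into_packets(data: bytes, size: int):
    """Source side: cut data into (sequence number, payload) packets."""
    return [(seq, data[start:start + size])
            for seq, start in enumerate(range(0, len(data), size))]


def reassemble(packets):
    """Destination side: sort by sequence number and join the payloads."""
    return b"".join(payload for _, payload in sorted(packets, key=lambda p: p[0]))


data = bytes(i % 256 for i in range(18000))   # the 18000 octet file
packets = split_into_packets(data, 300)       # 300 octet packets

in_transit = packets[:]
random.Random(0).shuffle(in_transit)          # packets arrive out of order

print(len(packets))                           # number of packets sent
print(reassemble(in_transit) == data)         # original data is recovered
```

Because every packet carries its own sequence number, the destination can rebuild the file no matter what order the packets arrive in.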
Data can easily be corrupted while it is being transferred from the source to the destination. If an
error control service is running and it detects data corruption, it asks the source to re-send the
packets of data. Thus only uncorrupted data reaches the destination. An error control service detects and
controls the same two types of errors-:
1.) Data Loss
2.) Data Corruption
The checksum values are used to detect whether the data has been modified or corrupted during the transfer from
source to destination, for instance by a fault in the communication channel.
Data corruption is detected by the checksum values and by performing Cyclic Redundancy Checks
(CRCs). CRCs, like checksums, are integer values, but they require a more involved calculation than a
simple sum. They are typically computed at the link level (the Ethernet checksum is a CRC), while the TCP
and IP headers use the simpler checksum.
There is yet another way of dealing with data corruption-: Handshaking.
This feature demands that both the source and destination transmit and receive
acknowledgement messages that confirm the transfer of uncorrupted data. Such acknowledgement messages
are known as ACK messages.
Let's take an example of a typical scenario of data transfer between two systems.
The source sends MSG1 to the destination. It will not send MSG2 to the destination until it gets the MSG1
ACK, and the destination will not request the next message (MSG2) until it
gets the ACK from the source confirming that the MSG1 ACK was received. If the source does not get an
ACK message from the destination, then something called a time-out occurs and the source will
re-send the data to the destination.
So this means that if A sends a data packet to B and B checksums the data packet and finds the data
corrupted, then it can simply delete the packet and wait for a time-out to take place. Once the time-out takes place, A will
re-send the data packet to B. But this system of silently deleting corrupt data is not used, as it is inefficient and
time consuming.
Instead of deleting the corrupt data and waiting for a time-out to take place, the destination (B) sends a not
acknowledged or NACK message to the source (A). When A gets the NACK message, instead of waiting for a
time-out to take place, it straightaway re-sends the data packet.
An ACK message of 1000 would mean that all data up to octet 1000 has been received so far.
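The ACK, NACK and time-out behaviour described above can be imitated with a small simulation. It is only a sketch: the function name and the list of channel outcomes are assumptions made for illustration, and no real network traffic is involved.

```python
def send_reliably(messages, channel_events):
    """Send each message until it is acknowledged.

    channel_events lists what happens to each transmission attempt:
    "ok" (delivered, ACK returned), "corrupt" (NACK returned) or
    "lost" (nothing returned, so the sender times out).
    """
    events = iter(channel_events)
    delivered, log = [], []
    for number, message in enumerate(messages, start=1):
        while True:
            outcome = next(events, "ok")  # assume a clean channel once events run out
            if outcome == "ok":
                delivered.append(message)
                log.append(f"MSG{number}: ACK")
                break  # only now may the next message be sent
            if outcome == "corrupt":
                log.append(f"MSG{number}: NACK, resend")
            else:
                log.append(f"MSG{number}: timeout, resend")
    return delivered, log


delivered, log = send_reliably(["first", "second"], ["ok", "corrupt", "lost", "ok"])
for line in log:
    print(line)
```

MSG2 is not considered sent until its ACK comes back, so the corrupted attempt and the lost attempt each trigger a re-send before the transfer completes.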
TCP/IP is a layered suite of protocols. All layers are equally important, and in the absence of even a
single layer, data transfer would not be possible. Each TCP/IP layer contributes to the entire
process of data transfer. An excellent example is sending an email. For sending mail there is a
separate protocol, the SMTP protocol which belongs to the Application layer. The SMTP Application
protocol like all other application layer protocols assumes that there is a reliable connection existing
between the two computers. For the SMTP application protocol to do what it is designed for, i.e. to send
mail, it requires the existence of all other Layers as well. The Physical Layer i.e. cables and wires is
required to transport the data physically. The Transmission Control Protocol or the TCP protocol which
belongs to the Transport Layer is needed to keep track of the number of packets sent and for error
correction. It is this protocol that makes sure that the data reaches the other end. The TCP protocol is called
by the Application Protocol to ensure error free communication between the source and destination. For the
TCP layer to do its work properly i.e. to ensure that the data packets reach the destination, it requires the
existence of the Internet Protocol or IP. The IP protocol contains the Checksum and Source and
Destination IP address.
You may wonder why we need different protocols like TCP and IP, and why they are not bundled into the
application protocol itself. The TCP protocol contains commands or functions which are needed by
various application protocols like FTP, SMTP and also HTTP. The TCP protocol also calls on the IP
protocol, which in turn contains commands or functions which some application protocols require while
others don't. So rather than bundling the entire TCP and IP protocol set into specific application protocols,
it is better to have different protocols which are called whenever required.
The Link Layer, which is the Hardware or Ethernet layer, is also needed for transportation of the data
packets. The PPP or Point to Point Protocol belongs to this layer. Before we go on, let's get accustomed
to certain TCP/IP terms. Most people get confused between datagrams and packets and think that they
are one and the same thing. You see, a datagram is a unit of data which is used by various protocols, and a
packet is a physical object or thing which moves on a physical medium like a wire. There is a real
difference between a packet and a datagram, but it is beyond the scope of this book. To make things easier
I will use only the term datagram (actually this is the official term) while discussing various protocols.
Two different main protocols are involved in transporting packets from source to destination.
1.) The Transmission Control Protocol or the TCP Protocol
2.) The Internet Protocol or the IP protocol.
Besides these two main protocols, the Physical Layer and the Ethernet Layer are also indispensable to data
transfer.
THE TRANSPORT LAYER
The TCP protocol
The Transmission Control Protocol is responsible for breaking up the data into smaller datagrams and
putting the datagrams back together to form usable data at the destination. It also re-sends lost datagrams to the
destination, where the received datagrams are reassembled in the right order. The TCP protocol does the
bulk of the work, but without the IP protocol it cannot transfer data.
Let's take an example to make things clearer. Let's say your Internet Protocol address or IP address is
xxx.xxx.xxx.xxx or simply x, and the destination's IP is yyy.yyy.yyy.yyy or simply y. As soon as the
three-way handshake has established a connection between x and y, x knows the destination IP address and also the port
to which it is connected. x and y may be in different networks which can handle different sized packets.
So in order to send datagrams of a receivable size, x must know the maximum datagram
size which y can handle. This too is determined by x and y at connection time.
So once x knows the maximum size of datagram which y can handle, it breaks down the data into
smaller chunks or datagrams. Each datagram has its own TCP header, which too is added by TCP.
A TCP header contains a lot of information, but the most important parts are the source and destination
port numbers and, yes, also the sequence number. The source and destination IP addresses are carried in the
IP header, which is described later.
**************
HACKING TRUTH: Learn more about Ports, IP's, Sockets in the Net Tools Manual
**************
The source, which is your computer (x), now knows the IP addresses and port numbers of the
destination and source computers. It now calculates the checksum value by adding up all the octets of
the datagram and puts the final checksum value into the TCP header. The individual octets, and not the
datagrams, are then numbered. An octet is the smallest unit (8 bits) into which the data is broken down. TCP then
puts all this information into the TCP header of each datagram. A TCP header of a datagram would finally
look like-:
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|          Source Port          |       Destination Port        |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                        Sequence Number                        |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                     Acknowledgment Number                     |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|  Data |           |U|A|P|R|S|F|                               |
| Offset| Reserved  |R|C|S|S|Y|I|            Window             |
|       |           |G|K|H|T|N|N|                               |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|           Checksum            |        Urgent Pointer         |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|           The actual data forms the next 500 octets           |
|                                                               |
There are certain new fields in the TCP header which you may not know of. Let's see what these new
fields signify. The Window field specifies how many octets of new data the receiver is ready to accept. You see,
not all computers connected to the Internet run at the same speed, and to ensure that a faster system does not
send datagrams to a slow system at a rate faster than it can handle, we use the Window field. As
the computer receives data, the space advertised in the Window field decreases, indicating that the receiver has
received the data. When it reaches zero the sender stops sending further packets. Once the receiver finishes
processing the received data, it increases the Window field, which in turn indicates that the receiver has
processed the earlier data and is ready to receive more chunks of data.
The Urgent field tells the remote computer to stop processing the last octet and instead receive the new
octet. It is not commonly used.
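To make the header layout concrete, here is a minimal Python sketch that packs and unpacks the fixed 20 octet part of a TCP header with the standard struct module. The function names and sample values are assumptions for illustration; the checksum is left at zero and no options are included.

```python
import struct

# Ports (16 bits each), sequence and acknowledgment numbers (32 bits each),
# data offset + reserved + flags (16 bits), window, checksum, urgent pointer.
TCP_HEADER = struct.Struct("!HHIIHHHH")

FIN, SYN, RST, PSH, ACK, URG = 0x01, 0x02, 0x04, 0x08, 0x10, 0x20


def build_tcp_header(src_port, dst_port, seq, ack, flags, window,
                     checksum=0, urgent=0):
    """Pack the fields into 20 octets (data offset = 5 words, no options)."""
    offset_and_flags = (5 << 12) | flags
    return TCP_HEADER.pack(src_port, dst_port, seq, ack,
                           offset_and_flags, window, checksum, urgent)


def parse_tcp_header(raw):
    """Unpack the first 20 octets of a TCP segment into named fields."""
    (src, dst, seq, ack, offset_and_flags,
     window, checksum, urgent) = TCP_HEADER.unpack(raw[:TCP_HEADER.size])
    return {
        "src_port": src, "dst_port": dst, "seq": seq, "ack": ack,
        "data_offset": offset_and_flags >> 12,   # header length in 32-bit words
        "flags": offset_and_flags & 0x3F,        # URG, ACK, PSH, RST, SYN, FIN
        "window": window, "checksum": checksum, "urgent": urgent,
    }


header = build_tcp_header(1025, 21, seq=1000, ack=0, flags=SYN, window=8192)
fields = parse_tcp_header(header)
print(len(header), fields["dst_port"], bool(fields["flags"] & SYN))
```

Note that the header carries only port numbers; the IP addresses of the two machines are not part of it.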
The TCP protocol is a reliable protocol, which means that we have a guarantee that the data will arrive at
the destination properly and without any errors. It ensures that the data being received by the receiving end
is arranged in the same correct order in which it was sent.
The TCP protocol relies on a virtual circuit between the client and the host. The circuit is opened via a
three part process known as the three-way handshake. It supports full duplex transportation of data, which means
that it provides a path for two way data transfer. Hence, using the TCP protocol, a computer can send and
receive datagrams at the same time.
Some common flags of TCP are-:
RST [RESET] - Resets the connection.
PSH [PUSH] - Tells the receiver to pass all queued data to the running application.
FIN [FINISH] - Closes the connection following the 4 step process.
SYN [SYNCHRONIZE] - Means that the machine sending this flag wants to start a three-way handshake, i.e.
establish a TCP connection. The receiver of a SYN flag usually responds with a SYN/ACK message.
So now we are in a position to represent a three-way TCP handshake:
A ---SYN---> B
A <---SYN/ACK--- B
A ---ACK---> B
A sends a SYN flag to B saying "I want to establish a TCP connection". B responds with a SYN/ACK, which
acknowledges A's SYN and carries B's own SYN. A then responds to the SYN/ACK sent by B with another ACK.
Read RFC 793 for further in depth details about the TCP protocol.
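The three messages of the handshake, together with the sequence and acknowledgment numbers they carry, can be sketched as plain Python data. The function name and the starting sequence numbers (100 and 300) are arbitrary choices for illustration; real systems pick their initial sequence numbers themselves.

```python
def three_way_handshake(client_isn, server_isn):
    """Return the three segments exchanged when A opens a connection to B."""
    syn = {"flags": {"SYN"}, "seq": client_isn}
    syn_ack = {"flags": {"SYN", "ACK"}, "seq": server_isn,
               "ack": syn["seq"] + 1}          # B acknowledges A's SYN
    ack = {"flags": {"ACK"}, "seq": syn_ack["ack"],
           "ack": syn_ack["seq"] + 1}          # A acknowledges B's SYN
    return [("A -> B", syn), ("B -> A", syn_ack), ("A -> B", ack)]


steps = three_way_handshake(client_isn=100, server_isn=300)
for direction, segment in steps:
    print(direction, sorted(segment["flags"]), segment)
```

Each side acknowledges the other's starting sequence number plus one, which is how both ends agree on where the numbering of octets begins.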
The User Datagram Protocol or the UDP Protocol
The User Datagram Protocol or UDP is yet another protocol which is a member of the Transport Layer. TCP
is the standard protocol used by most applications for communication. TCP is used to break down the data to be
transported into smaller datagrams before they (the datagrams) are sent across a network. Thus we can say
that TCP is used where multiple datagrams are involved.
Sometimes, the data to be transported is able to fit into a single datagram. We do not need to break the data
into smaller datagrams as the size of the data is pretty small. The perfect example of such data is the DNS
system. To send out the query for a particular domain name, a single datagram is more than enough. Also
the IP that is returned by the Domain Name Server does not require more than one datagram for
transportation. So in such cases instead of making use of the complex TCP protocol, applications fall back
to the UDP protocol.
The UDP protocol works almost the way TCP works. The differences are that UDP does not break the
data to be transferred into smaller chunks, does no sequencing (it inserts no sequence number in the header)
and does not re-send lost or corrupted data. Thus we can conclude by saying that the UDP protocol is an unreliable protocol with
no way to confirm that the data has reached the destination.
The UDP protocol does insert a UDP header into the single datagram it is transporting. The UDP header
contains the source and destination port numbers, a length and also the checksum value. The
UDP header is considerably smaller than the TCP header.
It is used by those applications where small chunks of data are involved. It offers services to the user's
network applications like NFS (Network File System) and SNMP.
Read RFC 768 for further in depth details about the UDP protocol.
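How small the UDP header is can be seen by building one. The sketch below is illustrative only: the function name, port 5353 and the payload are made up, and the checksum field is left at zero, which UDP over IPv4 treats as "no checksum computed".

```python
import struct

# Source port, destination port, length, checksum: four 16-bit fields.
UDP_HEADER = struct.Struct("!HHHH")


def build_udp_datagram(src_port, dst_port, payload, checksum=0):
    """Prefix payload with an 8 octet UDP header."""
    length = UDP_HEADER.size + len(payload)   # length covers header and data
    return UDP_HEADER.pack(src_port, dst_port, length, checksum) + payload


datagram = build_udp_datagram(5353, 53, b"query")   # 53 is the DNS port
print(UDP_HEADER.size, len(datagram))
```

There is no sequence number and no acknowledgment number anywhere in the header, which is why UDP cannot reorder datagrams or notice that one went missing.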
THE NETWORK LAYER
The IP Protocol
Both the TCP and the UDP protocols, after inserting their headers into the datagram(s) given to them, pass
them to the Internet Protocol or IP. The main job of the IP protocol is to find a way of
transporting the datagrams to the destination receiver. It does not do any kind of error checking on the data itself.
The IP protocol too adds its own IP header to each datagram. The IP header contains the source and
destination IP addresses, the protocol number and yet another checksum. The IP header of a particular
datagram looks like-:
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|Version|  IHL  |Type of Service|          Total Length         |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|         Identification        |Flags|      Fragment Offset    |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|  Time to Live |    Protocol   |         Header Checksum       |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                        Source Address                         |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                      Destination Address                      |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| TCP header info followed by the actual data being transferred |
|                                                               |
The source and destination IP addresses are needed so that... well, it is obvious, isn't it? The protocol
number is added so that the IP protocol knows to which transport protocol the datagram has to be passed.
You see, various transport protocols are used, for example TCP or UDP. So this protocol number is
inserted to tell IP the protocol to which the datagram has to be passed.
IP too inserts its own checksum value, which is different from the checksum value inserted by the
transport protocols. This checksum has to be inserted, as without it the Internet Protocol would not be able to
verify whether the header has been damaged in the transfer process, and the datagram might reach the wrong
destination. The Time to Live field specifies a value which is decreased each time the datagram passes
through a router; when it reaches zero the datagram is discarded. Remember Tracert?
The Internet Protocol Header contains other fields as well, but they are quite advanced and cannot be
included in a manual which gives an introduction to the TCP\IP protocol. To learn more about the IP
protocol read RFC 791.
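The header checksum and the Time to Live field can be explored with a short Python sketch. The helper names are hypothetical, the addresses are the example ones used elsewhere in this manual, and the identification, flags and fragment fields are simply left at zero.

```python
import socket
import struct

# Version/IHL, type of service, total length, identification, flags/fragment
# offset, time to live, protocol, header checksum, source, destination.
IP_HEADER = struct.Struct("!BBHHHBBH4s4s")


def checksum16(data: bytes) -> int:
    """16-bit one's-complement checksum (data must have an even length)."""
    total = sum((data[i] << 8) | data[i + 1] for i in range(0, len(data), 2))
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF


def with_checksum(header: bytes) -> bytes:
    """Zero the checksum field (octets 10-11), recompute it and fill it in."""
    blank = header[:10] + b"\x00\x00" + header[12:]
    return blank[:10] + struct.pack("!H", checksum16(blank)) + blank[12:]


def build_ip_header(src, dst, payload_len, protocol=6, ttl=64):
    """Build a 20 octet IPv4 header; protocol 6 means TCP, 17 means UDP."""
    raw = IP_HEADER.pack((4 << 4) | 5, 0, 20 + payload_len, 0, 0, ttl,
                         protocol, 0, socket.inet_aton(src), socket.inet_aton(dst))
    return with_checksum(raw)


def header_is_intact(header: bytes) -> bool:
    """A header that includes a correct checksum sums to zero."""
    return checksum16(header) == 0


def pass_through_router(header: bytes) -> bytes:
    """Decrease the Time to Live (octet 8) by one and refresh the checksum."""
    return with_checksum(header[:8] + bytes([header[8] - 1]) + header[9:])


header = build_ip_header("99.99.99.99", "98.98.98.98", payload_len=20)
forwarded = pass_through_router(header)
damaged = header[:12] + bytes([header[12] ^ 0xFF]) + header[13:]
print(header_is_intact(header), forwarded[8], header_is_intact(damaged))
```

The checksum here protects only the header, which matches the point made above: it guards against a damaged address sending the datagram to the wrong place, not against damaged data.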
The Internet Control Message Protocol or the ICMP
The ICMP protocol allows hosts to transfer information on errors that might have occurred during the data
transfer between two hosts. It is basically used to report errors that occur
during the data transfer. ICMP is a very simple protocol with only a small header of its own (a type, a
code and a checksum). It is most commonly
used to diagnose network problems. The famous utility PING is built on the ICMP protocol. ICMP
requests do not require the user or application to mention any port number, as all ICMP requests are
answered by the network software itself. The ICMP protocol too handles only a single datagram. That's
why we say that in PING only a single datagram is sent to the remote computer. This protocol can report many
network problems like Host Down, Congested Network etc.
Read RFC 792 for further in depth details about the ICMP protocol.
The Link Layer
Almost all networks use Ethernet. Each machine in a network has its own IP address and its own Ether
Address. The Ether Address of a computer is different from its IP address. An Ether Address is a 48 bit
address while the IP address is only a 32 bit address. A network must know which computer to deliver the
datagram to. Right? For this the Ether Header is used.
The Ether Header is a 14 octet header that contains the source and destination Ethernet addresses and a type
code. Ether too calculates its own checksum value. The type code relates to the protocol families to be
used within the network. The Ether Layer passes the datagram to the protocol specified by this field after
inserting the Ether Header. There is simply no connection between the Ethernet address and the IP address
of a machine. Each machine needs to have an Ethernet to IP address translation table on its hard disk.
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|         Ethernet destination address (first 32 bits)          |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Ethernet dest (last 16 bits)  |Ethernet source (first 16 bits)|
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|            Ethernet source address (last 32 bits)             |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|           Type code           |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| IP header, then TCP header, then your data                    |
|                                                               |
|                                                               |
|                                                               |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                       Ethernet Checksum                       |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
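The 14 octet Ether Header drawn above can be picked apart with a few lines of Python. This is a sketch with an invented function name and a hand-made frame: two 6 octet (48 bit) addresses followed by a 2 octet type code, where the value 0x0800 marks an IP datagram.

```python
import struct


def parse_ether_header(frame: bytes):
    """Split the first 14 octets into destination, source and type code."""
    dst, src, type_code = struct.unpack("!6s6sH", frame[:14])

    def as_text(address: bytes) -> str:
        return ":".join(f"{octet:02x}" for octet in address)

    return as_text(dst), as_text(src), type_code


# Hand-made frame: broadcast destination, example source address, IP type code.
frame = bytes.fromhex("ffffffffffff" "080039002fc3" "0800") + b"IP header, TCP header, data"
destination, source, type_code = parse_ether_header(frame)
print(destination, source, hex(type_code))
```

Everything after the first 14 octets is handed to whichever protocol the type code names, which in this frame is IP.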
Address Resolution Protocol or ARP
Data, before being transmitted across the Internet or across a local network, is broken down into smaller
packets which are suitable for transfer over the net. These packets have the source and destination IPs, but
for the transfer to take place the corresponding hardware addresses or MAC addresses must also be known.
That is where ARP comes in.
To get the hardware MAC address, ARP or Address Resolution Protocol sends a request message. The
machine that owns the requested IP address replies with its hardware address. ARP is similar to DNS and it too has a
cache. This cache can be
a bit vulnerable, as a hacker could forge a connection from a remote machine claiming to be one of the
cached locations. So we can conclude that ARP translates IPs into Ethernet addresses. One thing to
remember about ARP is that it only translates outgoing packets.
There is also something called RARP, which is an abbreviation for Reverse Address Resolution
Protocol, which like the name says does exactly the reverse of what ARP does.
There is simply no algorithm to get the Ethernet address from the IP address. To carry out such
translations, each computer has a file containing a table with a row for each computer and two columns for
the corresponding IP address and Ethernet address. The file is somewhat like the following-:
                 Internet Protocol Address    Ethernet Address
Computer Name    xxx.xy.yy.yx                 08-00-39-00-2F-C3
Say there is a system in a network (A) and an unidentified system (B) contacts it. Now A only knows the
IP address of B. A will first try to identify whether B is on the same network, so that it can
communicate directly via Ethernet. So it will first check the IP to MAC address translation table which it has. If it
finds the IP in the table then well and good, and A will establish a connection with B via Ethernet.
On the other hand, if A does not find any match for the specific IP, it will send out a request in the form of
a 'Broadcast'. All computers within the network will receive this broadcast, and the one that recognizes the IP
address as its own will reply with the necessary MAC address. A basic difference between an IP
address and a MAC address is that an IP is of the form xxx.xxx.xxx.xxx and a MAC address is of the form
xx:xx:xx:xx:xx:xx, and one is 32 bit while the other is 48 bit.
Read RFC 826 for further in depth details about the ARP protocol.
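The lookup described above, first the local table and then a broadcast, can be modelled as a small Python sketch. The function name, the addresses and the dictionary standing in for "all computers on the network" are assumptions for illustration; no real ARP traffic is generated.

```python
def resolve_mac(ip, arp_table, network):
    """Return (MAC address, how it was found) for an IP address.

    arp_table is this machine's IP to MAC translation table (its cache);
    network maps the IP of every machine on the local network to its MAC.
    """
    if ip in arp_table:
        return arp_table[ip], "cache"
    # Broadcast: every machine sees the request, only the owner answers.
    for host_ip, host_mac in network.items():
        if host_ip == ip:
            arp_table[ip] = host_mac   # remember the answer for next time
            return host_mac, "broadcast"
    return None, "unresolved"


network = {"192.168.0.2": "08:00:39:00:2f:c3", "192.168.0.3": "08:00:39:00:2f:c4"}
table = {}
first = resolve_mac("192.168.0.2", table, network)
second = resolve_mac("192.168.0.2", table, network)
missing = resolve_mac("192.168.0.9", table, network)
print(first, second, missing)
```

The second lookup is answered from the table without any broadcast, which is the whole purpose of keeping the cache.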
Application Layer
Till now you have learnt how data is broken down into smaller chunks and transferred to the destination,
where the chunks are rearranged. But there is yet another aspect to a successful data transfer process which
we have not discussed yet: the Application Protocols and the Application Layer itself. A host which
receives datagrams has many applications or services (daemons) running which are ready to establish a
TCP connection and accept a message. Datagrams travelling on the Internet must know which application
they have to establish a connection with and which application they have to send the message to. A typical web
server will have the FTP daemon, the HTTP daemon, the POP daemon and the SMTP daemon running.
Wouldn't the datagrams get confused as to which daemon to send the message to?
For the datagrams to know which computer to send the message to, we have IP addresses. The datagram
knows what daemon or application to send the message to by the port number attached to the IP address of
the destination. A TCP connection is fully described by 4 numbers: the IP addresses of the source and
destination, and the TCP port numbers at each end. The port numbers are found in
the TCP header and the IP addresses in the IP header.
To make it simpler to understand I have included an excerpt from the Net Tools Chapter:
What is all the hype about socket programming? What exactly are sockets?
TCP/IP or Transmission Control Protocol/Internet Protocol is the language or the protocol used by
computers to communicate with each other over the Internet. Say a computer whose IP address is
99.99.99.99 wants to communicate with another machine whose IP address is 98.98.98.98, then what will
happen?
The machine whose IP is 99.99.99.99 sends a packet addressed to the machine whose IP is
98.98.98.98. When 98.98.98.98 receives the packet, it confirms that it got the message by sending a
signal back to 99.99.99.99. But say the person who is using 99.99.99.99 wants to have more
than one simultaneous connection to 98.98.98.98... then what will happen? Say 99.99.99.99 wants to connect to
the FTP daemon and download a file by FTP, and at the same time it wants to connect to 98.98.98.98's
website, i.e. the HTTP daemon. Then 98.98.98.98 will have 2 connections with 99.99.99.99 simultaneously.
Now how can 98.98.98.98 distinguish between the two connections... how does 98.98.98.98 know which
is for the FTP daemon and which for the HTTP daemon? If there was no way to distinguish between the
two connections then they would both get mixed up and there would be a lot of chaos, with the message
meant for the HTTP daemon going to the FTP daemon. To avoid such confusion we have ports. At each
port a particular service or daemon is running by default. So now that the 99.99.99.99 computer knows
which port to connect to in order to download an FTP file, and which port to connect to in order to download the web page,
it will communicate with the 98.98.98.98 machine using what is known as the socket pair, which is a
combination of an IP address and a port. So in the above case the message which is meant for the FTP
daemon will be addressed to 98.98.98.98 : 21 (notice the colon and the default FTP port succeeding it),
so that the receiving machine, i.e. 98.98.98.98, will know which service this message is meant for and
which port it should be directed to.
In TCP/IP or over the Internet, all communication is done using the socket pair, i.e. the combination of the
IP address and the port.
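How ports keep two simultaneous connections apart can be shown with a tiny Python sketch. The table of daemons uses the usual default ports (21 for FTP, 25 for SMTP, 80 for HTTP, 110 for POP); the function name and the client port numbers 1025 and 1026 are made up for illustration.

```python
# Default ports on which the daemons of a typical server listen.
DAEMONS = {21: "FTP daemon", 25: "SMTP daemon", 80: "HTTP daemon", 110: "POP daemon"}


def dispatch(connection, daemons=DAEMONS):
    """Pick the daemon for a (source IP, source port, dest IP, dest port) tuple."""
    src_ip, src_port, dst_ip, dst_port = connection
    return daemons.get(dst_port, "no daemon listening")


# The same two machines, two connections at the same time.
ftp_connection = ("99.99.99.99", 1025, "98.98.98.98", 21)
web_connection = ("99.99.99.99", 1026, "98.98.98.98", 80)

print(dispatch(ftp_connection))
print(dispatch(web_connection))
print(ftp_connection != web_connection)   # the four numbers tell them apart
```

Both connections share the same pair of IP addresses, so it is only the port numbers that stop the FTP traffic and the web traffic from getting mixed up.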
*****************
HACKING TRUTH: Learn More about Ports, IP addresses and Sockets by reading the Net Tools Chapter.
*****************
The Application Layer basically consists of the applications running on your computer and the
applications running on the host to which you are connected. Say you are viewing the Hotmail site, then
the application layer comprises the web browser running on your computer and the HTTP daemon
running at Hotmail's server, and the application protocol being used to communicate is the HyperText Transfer
Protocol.
As soon as a TCP connection is established, the applications running on each end decide the language or
protocol to be used to communicate and send datagrams.
IP Spoofing Torn Apart
IP spoofing is the most exciting topic you will hear wannabe hackers talking about. It is also a subject
about which few people know much. Before we continue I would like to tell you that IP Spoofing is quite
difficult to understand and a lot of people have trouble understanding how it is done. The other downside it
has is the fact that it can hardly be done using a Windows system, and a system administrator can easily
protect his system from IP spoofing.
So what is IP Spoofing? IP Spoofing is a trick played on servers to fool the target computer into thinking
that it is receiving data from a source other than you. This in turn basically means to send data to a remote
host so that it believes that the data is coming from a computer whose IP address is something other than
yours. Let's take an example to make it clear:
Your IP is : 203.45.98.01 (REAL)
IP of Victim computer is: 202.14.12.1 (VICTIM)
IP you want data to be sent from: 173.23.45.89 (FAKE)
Normally sitting on the computer whose IP is REAL, the datagrams you send to VICTIM will appear to
have come from REAL. Now consider a situation in which you want to send a datagram to VICTIM and
make him believe that it came from a computer whose IP is FAKE. This is when you perform IP Spoofing.
The Main problem with IP Spoofing is that even if you are able to send a spoofed datagram to the remote
host, the remote host will reply not to your real IP but to the Fake IP you made your datagram seem to have
come from. Getting confused? Read the following example to clear up your mind.
Taking the same IPs as in the last example, consider the following scenario. If REAL connects to
VICTIM, the standard three-way handshake takes place and VICTIM sends its SYN/ACK message to
REAL. Now if you spoof your IP to, say, FAKE, then VICTIM will try to establish a TCP connection with FAKE and
will send the SYN/ACK message to FAKE. Now let's assume that FAKE is alive; then, as it had not requested the
SYN/ACK message (sent by VICTIM to FAKE), it will reply with a RST message, which would basically end
the connection, and no further communication between FAKE and VICTIM would take place. If
FAKE doesn't exist, then the SYN/ACK message sent by VICTIM will not get any reply and in the end the
connection times out.
Due to this FAKE and REAL IP problem, when a person is trying to perform an IP spoof, he does not get
any response from the remote host and has no clue whether he has been successful or not, or whether he has made
any progress. You are as good as blind, with no medium through which you could get feedback.
IP Spoofing can be successful only if the computer with the FAKE IP does not reply to the victim and not
interrupt the spoofed connection. Take the example of a telephone conversation, you can call up a person
' x ' and pretend to be ' y ' as long as ' y ' does not interrupt the conversation and give the game away.
So why would you need to perform IP Spoofing?
1.) To pretend that you are some other computer whose IP address is in the list of trusted computers kept
on the victim's disk. This way you can exploit the 'r' services and gain access to the network, as you are
then believed to be a trusted source.
2.) To disguise or mask your IP address so that the victim does not know who you really are and where
the data is coming from.
If you ever read the alt.2600 or the alt.hacking newsgroup, you would probably find many postings like "I
have Win98, how do I Spoof my IP" or even "I do not know TCP/IP. Tell me how to perform IP spoofing".
The very fact that they post such questions and expect to learn how to spoof their IP without knowing a bit
about TCP/IP confirms that they would not be able to perform IP Spoofing. No, I am not saying that asking
questions is bad. Not knowing something is not so bad, but not knowing something and refusing to learn
it is really, really bad.
IP spoofing is a complex subject and is difficult to perform. You need to work through entire TCP/IP and
networking protocol manuals and need to be able to write C programs which will help you in the
spoofing process. It is amazing how people think that they can spoof their IP without even knowing
what TCP/IP stands for.
All packets travelling across the Internet have headers which contain the source and destination IP
addresses and port numbers, so that the network knows where to deliver the packet and the destination
knows where the packet has come from and where to respond. Spoofing means changing the source IP
address contained in the header of the packet, in turn fooling the receiver of the packet into believing that
it came from somewhere else, namely the fake IP. Now let's again look at the IP Header of a
datagram.
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|Version|  IHL  |Type of Service|          Total Length         |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|         Identification        |Flags|      Fragment Offset    |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|  Time to Live |    Protocol   |         Header Checksum       |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                       Source Address                          |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                    Destination Address                        |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| TCP header info followed by the actual data being transferred |
|                                                               |
Now basically to perform IP spoofing we need to be able to change the value of the Source Address field.
To do this successfully you also need to be able to guess sequence numbers, which is quite a sophisticated
process, and I will try to explain it as clearly as possible. Before we go on, you need to understand that IP
spoofing is not the entire process; it is just a stepping stone in the process of fooling the remote host and
establishing a trust relationship with it.
So how do these trust relationships take place? Well, all of you have encountered some form of
authentication process or the other. The Username-Password pair is the most commonly used form of
authentication, and the one we are most familiar with. In this form of authentication, the remote host to
which the client is connected challenges the client by asking the user to type in the Username and
Password. So here the user needs to intervene: the remote host challenges the user to enter the Username
and Password, which act as a form of authentication.
Other than the Username-Password form of authentication there is yet another form of authentication
most users do not know of: the Client IP. In this form of authentication, the remote host finds out the IP
address of the client and compares it with a predefined list of IPs. If the IP of the client who is trying to
establish a connection with the remote host is found in the list of IPs maintained by the host, then it allows
the client access to the shell 'without a password', as the identity of the client is considered to have already
been authenticated.
Such trust relationships are common in Unix systems which have certain 'R services' like rsh, rlogin and
rcp. These have certain security problems and should be avoided. Despite the threat involved, most
ISPs in India still keep the ports of the R services open to be exploited by hackers. You normally establish
an rlogin trust relationship by using the Unix command,
$>rlogin IP address
**************
HACKING TRUTH: Well, there is definitely a cooler way of establishing a trust relationship with a remote
host, using Telnet. The default port numbers at which the R services run are 512, 513 and 514.
**************
So how do I spoof my IP? Well, in short, to spoof your IP you need to be able to predict sequence numbers.
This will become clearer after reading the next few paragraphs.
To understand Sequence Numbers you need to go back to how the TCP protocol works. You already
know that TCP is a reliable protocol and has certain in-built features which let it re-send lost data and
rearrange or discard duplicated or out of sequence data. To make sure that the destination is able to
rearrange the datagrams in the correct order, TCP inserts two sequence numbers into each TCP datagram.
One sequence number tells the receiving computer where a particular datagram belongs, while the second
(the acknowledgment number) says how much data the sender of the datagram has itself received so far.
Anyway, let's move on. TCP also relies on ACK messages and retransmission timers to ensure that all
datagrams have reached the destination error free.
Now we need to reanalyze the TCP Header to understand certain other aspects of sequence numbers and
the ACK Number.
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|          Source Port          |       Destination Port        |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                        Sequence Number                        |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                    Acknowledgment Number                      |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|  Data |           |U|A|P|R|S|F|                               |
| Offset| Reserved  |R|C|S|S|Y|I|            Window             |
|       |           |G|K|H|T|N|N|                               |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|           Checksum            |         Urgent Pointer        |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|           The Actual Data form the next 500 octets            |
|                                                               |
The TCP Header contains a Sequence Number which represents the sequence number of the first byte of
data in that particular TCP segment. A sequence number is a 32 bit number, and every byte of data
exchanged over the connection is counted by it. The ACK Number field in the TCP header contains the
sequence number which the sender of the segment expects to receive next. In doing so it also does what it
was meant to do: acknowledge all the data received before that number. Confused? Read it again till you
get the hang of it.
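In the older BSD-derived TCP implementations, the initial sequence number or ISN counter is initialized
to 1 when the host boots. This counter is then incremented by 128,000 every second and by a further
64,000 for every new connection. So there is a fixed pattern according to which the sequence numbers
change, which makes them easy to predict. Modern systems randomize their ISNs for exactly this reason.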
To perform IP spoofing successfully, or even to predict Sequence Numbers, you need to be running a
form of Unix, as Windows does not give users access to the really advanced system internals.
Without a form of Unix, IP Spoofing is almost impossible to do.
This text is not the ultimate guide to IP Spoofing and was aimed only at giving you a general outline of the
whole process. Sequence number prediction is really, really sophisticated and difficult to understand, but
not impossible to do. However, a system administrator can easily protect his systems from IP spoofing, and
this makes it quite useless in practice, though nonetheless truly exciting. If you really want to learn IP
Spoofing I suggest you read IP Spoofing Demystified by daemon9/route/infinity, which was a part of Issue
48 of PHRACK magazine, File 14 of 18. Go to the Archive Section of their site, http://www.phrack.com
and click on Issue 48.
This brings me to the other purpose for which people use IP Spoofing: IP Masking. To do something as
simple as masking or hiding your IP, you do not need to go through the complex procedure of guessing
sequence numbers and performing IP Spoofing. There are proxy servers to do that for you. Read the Net
Tools chapter for further details.
Port Scanning in Networking Terms
Earlier we learnt what a Port Scan is and why it is considered to be such an important tool for getting
information about the remote host, which in turn can be used to exploit any vulnerabilities and break into
the system. We all know how a manual Port Scan works. You launch Telnet and manually Telnet to each
port, jotting down information that you think is important. In a manual Port Scan, when you telnet to a port
of a remote host, a full three-way handshake takes place, which means that a complete TCP connection
opens.
The earliest Port Scanners used the same technique. They connected to each port and completed a full
three-way handshake for a complete TCP connection. The downside of such port scanners was that, as a
full TCP connection was being established, the system administrator could easily detect that someone was
trying to port scan his systems to find a vulnerability. However, such port scanning methods also had a
bright side: as an actual TCP connection was being established, the port scanning software did not have to
build a fake Internet Protocol packet by hand. (Such a hand-built IP packet is what the newer scanners use
to probe the remote system.) These TCP scanners too relied on the three-way TCP handshake to detect
whether a port is open or not. The basic process of detecting whether a port is open is described below:
1.) You send a TCP packet containing the SYN flag to the remote host.
2.) The remote host checks whether the port is open or not. If the port is open, it replies with a
TCP packet containing both the SYN flag and an ACK, which confirms that the port is open. If, on the
other hand, the port is closed, the remote host sends a packet with the RST flag, which resets, in
short closes, the connection.
3.) In the third phase the client sends an ACK message, which completes the handshake. For the purpose
of detecting an open port this step is optional.
As TCP scanners were detectable, programmers around the world developed a new kind of port scanner,
the SYN Scanner, which did not establish a complete TCP connection. These port scanners avoid
detection by sending only the first TCP packet containing the SYN flag and establishing a half-open
TCP connection. To understand the working of a SYN or half-open Port Scanner, simply read its 3 step
working-:
1. The SYN Port Scanner sends the first TCP packet, containing the SYN flag, to the remote host.
2. The remote system replies with either a SYN plus ACK or a RST.
3. When the SYN Port Scanner receives one of the above responses, it knows whether the respective port
is open or not and whether a daemon is listening for connections.
The SYN Port Scanners were undetectable by most normal port scan detectors; however, newer port
scan detectors like netstat and also some firewalls can detect and filter out such scans. Another downside
to such scanning is that the method by which the scanner builds the IP packet varies from system to system.
UDP Scanning
It is yet another port scanning technique, which can be used to scan a UDP port to see if it is listening. To
detect an open UDP port, simply send a single UDP packet to the port. If it is listening, you may get a
response; if it is not, then ICMP takes over and returns the error message "Destination Port
Unreachable".
FIN Port Scanners
FIN Port Scanners are my favorite type of port scanners. They send a single packet containing the FIN
flag. If the remote host returns a RST flag then the port is closed; if no RST flag is returned, then it is
probably open and listening.
Some port scanners also use the technique of sending an ACK packet: if the Time To Live or TTL of the
returning RST packet is lower than that of the RST packets received earlier, or if its window size is greater
than zero, then the port is probably open and listening.
The Following is the code of a supposedly Stealth Port Scanner which appeared in the Phrack Magazine.
/*
* scantcp.c
*
* version 1.32
*
* Scans for listening TCP ports by sending packets to them and waiting for
* replies. Relys upon the TCP specs and some TCP implementation bugs found
* when viewing tcpdump logs.
*
* As always, portions recycled (eventually, with some stops) from n00k.c
* (Wow, that little piece of code I wrote long ago still serves as the base
* interface for newer tools)
*
* Technique:
* 1. Active scanning: not supported - why bother.
*
* 2. Half-open scanning:
* a. send SYN
* b. if reply is SYN|ACK send RST, port is listening
* c. if reply is RST, port is not listening
*
* 3. Stealth scanning: (works on nearly all systems tested)
* a. sends FIN
* b. if RST is returned, not listening.
* c. otherwise, port is probably listening.
*
* (This bug in many TCP implementations is not limited to FIN only; in fact
* many other flag combinations will have similar effects. FIN alone was
* selected because always returns a plain RST when not listening, and the
* code here was fit to handle RSTs already so it took me like 2 minutes
* to add this scanning method)
*
* 4. Stealth scanning: (may not work on all systems)
* a. sends ACK
* b. waits for RST
* c. if TTL is low or window is not 0, port is probably listening.
*
* (stealth scanning was created after I watched some tcpdump logs with
* these symptoms. The low-TTL implementation bug is currently believed
* to appear on Linux only, the non-zero window on ACK seems to exists on
* all BSDs.)
*
* CHANGES:
* --------
* 0. (v1.0)
* - First code, worked but was put aside since I didn't have time nor
* need to continue developing it.
* 1. (v1.1)
* - BASE CODE MOSTLY REWRITTEN (the old code wasn't that maintainable)
* - Added code to actually enforce the usecond-delay without usleep()
* (replies might be lost if usleep()ing)
* 2. (v1.2)
* - Added another stealth scanning method (FIN).
* Tested and passed on:
* AIX 3
* AIX 4
* IRIX 5.3
* SunOS 4.1.3
* System V 4.0
* Linux
* FreeBSD
* Solaris
*
* Tested and failed on:
* Cisco router with services on ( IOS 11.0)
*
* 3. (v1.21)
* - Code commented since I intend on abandoning this for a while.
*
* 4. (v1.3)
* - Resending for ports that weren't replied for.
* (took some modifications in the internal structures. this also
* makes it possible to use non-linear port ranges
* (say 1-1024 and 6000))
*
* 5. (v1.31)
* - Flood detection - will slow up the sending rate if not replies are
* recieved for STCP_THRESHOLD consecutive sends. Saves alot of resends
* on easily-flooded networks.
*
* 6. (v1.32)
* - Multiple port ranges support.
* The format is: <start-end>|<num>[,<start-end>|<num>,...]
*
* Examples: 20-26,113
* 20-100,113-150,6000,6660-6669
*
* PLANNED: (when I have time for this)
* ------------------------------------
* (v2.x) - Multiple flag combination selections, smart algorithm to point
* out uncommon replies and cross-check them with another flag
*
*/
#define RESOLVE_QUIET
#include <stdio.h>
#include <netinet/in.h>
#include <netinet/ip.h>
#include <netinet/ip_tcp.h>
#include <sys/time.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <unistd.h>
#include <stdlib.h>
#include <string.h>
#include <signal.h>
#include <errno.h>
#include "resolve.c"
#include "tcppkt03.c"
#define STCP_VERSION "1.32"
#define STCP_PORT 1234 /* Our local port. */
#define STCP_SENDS 3
#define STCP_THRESHOLD 8
#define STCP_SLOWFACTOR 10
/* GENERAL ROUTINES ------------------------------------------- */
void banner(void)
{
printf("\nscantcp\n");
printf("version %s\n",STCP_VERSION);
}
void usage(const char *progname)
{
printf("\nusage: \n");
printf("%s <method> <source> <dest> <ports> <udelay> <delay> [sf]\n\n",progname);
printf("\t<method> : 0: half-open scanning (type 0, SYN)\n");
printf("\t 1: stealth scanning (type 1, FIN)\n");
printf("\t 2: stealth scanning (type 2, ACK)\n");
printf("\t<source> : source address (this host)\n");
printf("\t<dest> : target to scan\n");
printf("\t<ports> : ports/and or ranges to scan - eg: 21-30,113,6000\n");
printf("\t<udelay> : microseconds to wait between TCP sends\n");
printf("\t<delay> : seconds to wait for TCP replies\n");
printf("\t[sf] : slow-factor in case sends are dectected to be too fast\n\n");
}
/* OPTION PARSING etc ---------------------------------------- */
unsigned char *dest_name;
unsigned char *spoof_name;
struct sockaddr_in destaddr;
unsigned long dest_addr;
unsigned long spoof_addr;
unsigned long usecdelay;
unsigned waitdelay;
int slowfactor = STCP_SLOWFACTOR;
struct portrec /* the port-data structure */
{
unsigned n;
int state;
unsigned char ttl;
unsigned short int window;
unsigned long int seq;
char sends;
} *ports;
char *portstr;
unsigned char scanflags;
int done;
int rawsock; /* socket descriptors */
int tcpsock;
int lastidx = 0; /* last sent index */
int maxports; /* total number of ports */
void timeout(int signum) /* timeout handler */
{ &
The TCP/IP Suite is a stacked suite with various layers stacked on each other, each layer looking after one
aspect of the data transfer. Data is transferred from one layer to the other. The entire TCP/IP suite can be
broken down into the below layers-:
Layer Name                             Protocol
Link Layer (Hardware, Ethernet)        ARP, RARP, PPP, Ether
Network Layer (The Invisible Layer)    IP, ICMP
Transport Layer                        UDP, TCP
Application Layer (The Visible Layer)  The actual running applications like-: FTP client, Browser
Physical Layer (Not part of TCP/IP)    Physical Data Cables, Telephone wires
Data travels from the Application Layer down to the Physical Layer at the source, and at the destination it
travels from the Physical Layer back up to the Application Layer. We will later discuss what each layer and
each protocol does.
The TCP/IP suite not only helps to transfer data but also has to correct various problems that might occur
during the data transfer. There are basically two types of common errors that might occur during the
process of data transfer. They are-:
Data Corruption -: In this kind of error, the data reaches the destination after getting corrupted.
Data Loss -: In this kind of error, not all of the packets which constitute the data to be transferred
reach the destination.
TCP/IP expects such errors to take place and has certain features which detect and correct them.
Checksums-: A checksum is a value (normally a 16 bit value) that is formed by summing up the binary
data of a given data block. The sending program is responsible for the calculation of the checksum value,
and it sends this calculated value along with the data packets to the destination. When the program running
at the destination receives the data packets, it re-calculates the checksum value. If the value calculated by
the destination program matches the checksum value attached to the data packets by the source program,
then the data transfer is said to be valid and error free. The checksum is calculated by adding up all the
octets in a datagram.
Packet Sequencing-: All data being transferred on the net is broken down into packets at the source and
joined together at the destination. The data is broken down into packets in a particular sequence at the
source. This means that, for example, the first byte has the first sequence number, the second byte the
second sequence number and so on. These packets are free to travel independently on the net, so
sometimes the data packets arrive at the destination out of sequence, which means that the
packet which had the first sequence number attached to it does not reach the destination first. Sequencing
defines the order in which the hosts must process the data packets or messages. The application or the layer
running at the destination automatically rebuilds the data from the sequence number in each packet.
The source system breaks the data to be transferred into smaller packets and assigns each packet a unique
sequence number. When the destination gets the packets, it starts rearranging them by reading the
sequence number of each packet, to make the data received usable.
For example, say you want to transfer an 18000 octet file. Not all networks can handle an entire 18000
octet packet at a time, so the huge file is broken down into smaller packets of, say, 300 octets each. Each
packet is assigned a unique sequence number. When the packets reach the destination, they are put
back together to get the usable data. During transportation, as the packets can move
independently on the net, it is possible that packet 5 will arrive at the destination before packet 4
arrives. In such a situation, the sequence numbers are used by the destination to rearrange the data packets
in such a way that even if packet 5 arrived earlier, packet 4 will always precede packet 5.
Data can easily be corrupted while it is being transferred from the source to the destination. If an
error control service is running and it detects data corruption, it asks the source to re-send the
packets of data. Thus only non-corrupted data reaches the destination. An error control service detects and
controls the same two types of errors-:
1.) Data Loss
2.) Data Corruption
The checksum values are used to detect whether the data has been modified or corrupted during the transfer
from source to destination, or whether corruption in the communication channel may have caused data loss.
Data Corruption is detected by the checksum values and by performing Cyclic Redundancy Checks
(CRCs). CRCs, like checksums, are integer values, but they require more involved calculation and hence
are left to the link layer (Ethernet frames, for example, carry a CRC) rather than used by TCP and IP
themselves.
There is yet another way of detecting data corruption-: Handshaking.
This feature demands that both the source and destination transmit and receive
acknowledgement messages that confirm the transfer of uncorrupted data. Such acknowledgement
messages are known as ACK messages.
Let's take an example of a typical scenario of data transfer between two systems.
Source sends MSG1 to Destination. It will not send MSG2 to Destination unless and until it gets the MSG1
ACK, and Destination will not send more requests for data or the next request message (MSG2) unless it
gets the ACK from Source confirming that the MSG1 ACK was received by it. If the source does not get an
ACK message from the destination, then something called a time-out occurs and the source will
re-send the data to the destination.
So this means that if A sends a data packet to B and B checksums the data packet and finds the data
corrupted, then it can simply delete the packet and wait for a time-out to take place. Once the time-out
takes place, A will re-send the data packet to B. But this system of silently deleting corrupt data is
inefficient and time consuming.
Instead of deleting the corrupt data and waiting for a time-out to take place, the destination (B) can send a
not acknowledged or NACK message to the source (A). When A gets the NACK message, instead of
waiting for a time-out to take place, it straightaway resends the data packet.
An ACK message of 1000 would mean that all data up to 1000 octets has been received so far.
TCP/IP is a layered suite of protocols. All layers are equally important, and in the absence of even a
single layer data transfer would not be possible. Each TCP/IP layer contributes to the entire
process of data transfer. An excellent example is sending an email. For sending mail there is a
separate protocol, SMTP, which belongs to the Application Layer. SMTP, like all other application layer
protocols, assumes that there is a reliable connection existing between the two computers. For SMTP to do
what it is designed for, i.e. to send mail, it requires the existence of all the other layers as well. The
Physical Layer, i.e. cables and wires, is required to transport the data physically. The Transmission Control
Protocol or TCP, which belongs to the Transport Layer, is needed to keep track of the number of packets
sent and for error correction. It is this protocol that makes sure that the data reaches the other end. TCP is
called by the application protocol to ensure error free communication between the source and destination.
For the TCP layer to do its work properly, i.e. to ensure that the data packets reach the destination, it
requires the existence of the Internet Protocol or IP. The IP header contains its own checksum and the
source and destination IP addresses.
You may wonder why we need different protocols like TCP and IP, and why we do not bundle them into
each application protocol. TCP contains commands or functions which are needed by
various application protocols like FTP, SMTP and also HTTP. TCP also calls on IP,
which in turn contains commands or functions which some application protocols require while
others don't. So rather than bundling the entire TCP and IP protocol set into specific application protocols,
it is better to have separate protocols which are called whenever required.
The Link Layer, which is the Hardware or Ethernet layer, is also needed for transportation of the data
packets. PPP or the Point to Point Protocol belongs to this layer. Before we go on, let's get accustomed
to certain TCP/IP terms. Most people get confused between datagrams and packets and think that they
are one and the same thing. A datagram is a unit of data which is used by various protocols, and a
packet is a physical object or thing which moves on a physical medium like a wire. There is a remarkable
difference between a Packet and a Datagram, but it is beyond the scope of this book. To make things easier
I will use only the term datagram (actually this is the official term) while discussing various protocols.
Two main protocols are involved in transporting packets from source to destination:
1.) The Transmission Control Protocol or TCP
2.) The Internet Protocol or IP
Besides these two main protocols, the Physical Layer and the Ethernet Layer are also indispensable to data
transfer.
THE TRANSPORT LAYER
The TCP protocol
The Transmission Control Protocol is responsible for breaking up the data into smaller datagrams and
putting the datagrams back together to form usable data at the destination. It also resends lost datagrams to
the destination, where the received datagrams are reassembled in the right order. TCP does the
bulk of the work, but without IP it cannot transfer data.
Let's take an example to make things clearer. Let's say your Internet Protocol Address or IP address is
xxx.xxx.xxx.xxx or simply x and the destination's IP is yyy.yyy.yyy.yyy or simply y. As soon as the
three-way handshake has established a connection between x and y, x knows the destination IP address and
also the port to which it is connected. x and y are in different networks, which can handle different sized
packets. So in order to send datagrams of a receivable size, x must know the maximum datagram
size which y can handle. This too is determined by x and y at connection time.
Once x knows the maximum size of datagram which y can handle, it breaks down the data into
smaller chunks or datagrams. Each datagram has its own TCP header, which is also added by TCP.
A TCP Header contains a lot of information, but the most important parts are the Source and Destination
Port numbers and, yes, also the sequence number. (The Source and Destination IP addresses travel in the
IP header.)
**************
HACKING TRUTH: Learn more about Ports, IP's, Sockets in the Net Tools Manual
**************
The source, which is your computer (x), now knows the IP Addresses and Port Numbers of the
Destination and Source computers. It now calculates the checksum value by adding up all the octets of
the datagram and puts the final checksum value into the TCP Header. The individual octets, and not the
datagrams, are then numbered. An octet is a single 8 bit unit (one byte) of the data. TCP then
puts all this information into the TCP header of each datagram. A TCP Header of a datagram would finally
look like -:
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|          Source Port          |       Destination Port        |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                        Sequence Number                        |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                    Acknowledgment Number                      |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|  Data |           |U|A|P|R|S|F|                               |
| Offset| Reserved  |R|C|S|S|Y|I|            Window             |
|       |           |G|K|H|T|N|N|                               |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|           Checksum            |         Urgent Pointer        |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|           The Actual Data form the next 500 octets            |
|                                                               |
There are certain new fields in the TCP header which you may not know of. Let's see what these new
fields signify. The Window field specifies how many octets of new data the receiver is ready to accept.
Not all computers connected to the Internet run at the same speed, and to ensure that a faster system does
not send datagrams to a slower system at a rate faster than it can handle, we use the Window field. As
the computer receives data, the space advertised in the Window field decreases, indicating that the receiver
has received the data. When it reaches zero the sender stops sending further packets. Once the receiver
finishes processing the received data, it increases the Window field, which in turn indicates that the
receiver has processed the earlier data and is ready to receive more chunks of data.
The Urgent field tells the remote computer to stop processing the last octet and instead receive the new
octet. This is not commonly used.
TCP is a reliable protocol, which means that we have a guarantee that the data will arrive at
the destination properly and without any errors. It ensures that the data received by the receiving end
is arranged in the same order in which it was sent.
TCP relies on a virtual circuit between the client and the host. The circuit is opened via a 3
part process known as the three-way handshake. It supports full duplex transportation of data, which means
that it provides a path for two way data transfer. Hence, using TCP, a computer can send and
receive datagrams at the same time.
Some common flags of TCP are-:
RST [RESET] - Resets the connection.
PSH [PUSH] - Tells the receiver to pass all queued data to the running application.
FIN [FINISH] - Closes the connection following the 4 step process.
SYN [SYNCHRONIZE] - Means that the machine sending this flag wants to start a three-way handshake,
i.e. a TCP connection. The receiver of a SYN flag usually responds with a SYN of its own together with
an ACK.
So now we are in a position to represent a three way TCP Handshake:
A ----SYN----> B
A <--SYN/ACK-- B
A ----ACK----> B
A sends a SYN flag to B saying "I want to establish a TCP connection". B responds with a SYN/ACK, which
acknowledges A's SYN and carries B's own SYN. A then responds to B's SYN/ACK with a final ACK.
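The three steps of the handshake can be written down as a tiny state machine. The sketch below is only an illustration under simplifying assumptions (no sequence numbers, no timeouts, no retransmission); the HS_ state names and the function hs_on_segment are hypothetical, while the flag bit values are the ones used in the TCP header.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Flag bits as they appear in the TCP header. */
#define FLAG_FIN 0x01
#define FLAG_SYN 0x02
#define FLAG_RST 0x04
#define FLAG_ACK 0x10

/* Hypothetical, heavily simplified connection states. */
enum hs_state { HS_LISTEN, HS_SYN_SENT, HS_SYN_RCVD, HS_ESTABLISHED, HS_CLOSED };

/* Advance one endpoint given the flags of the segment it just received.
 * *reply is set to the flags to send back (0 means send nothing). */
enum hs_state hs_on_segment(enum hs_state s, uint8_t flags, uint8_t *reply)
{
    assert(reply != NULL);
    *reply = 0;
    if (flags & FLAG_RST)
        return HS_CLOSED;               /* a RST always tears things down */
    switch (s) {
    case HS_LISTEN:                     /* B: waiting for A's SYN */
        if (flags == FLAG_SYN) {
            *reply = FLAG_SYN | FLAG_ACK;
            return HS_SYN_RCVD;
        }
        break;
    case HS_SYN_SENT:                   /* A: sent SYN, waiting for SYN/ACK */
        if (flags == (FLAG_SYN | FLAG_ACK)) {
            *reply = FLAG_ACK;
            return HS_ESTABLISHED;
        }
        break;
    case HS_SYN_RCVD:                   /* B: sent SYN/ACK, waiting for ACK */
        if (flags == FLAG_ACK)
            return HS_ESTABLISHED;
        break;
    default:
        break;
    }
    return s;                           /* anything else is ignored here */
}
```

Feeding A's SYN to B (in HS_LISTEN), B's reply to A (in HS_SYN_SENT) and A's reply back to B leaves both ends in HS_ESTABLISHED, which is exactly the SYN, SYN/ACK, ACK exchange.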
Read RFC 793 for further in depth details about the TCP protocol.
The User Datagram Protocol or the UDP Protocol
The User Datagram Protocol or the UDP is yet another protocol which is a member of the Transport Layer. TCP
is the protocol most applications use for communications. TCP breaks down the data to be
transported into smaller datagrams, before they (the datagrams) are sent across a network. Thus we can say
that TCP is used where more than a single datagram is involved.
Sometimes, the data to be transported is able to fit into a single datagram. We do not need to break the data
into smaller datagrams as the size of the data is pretty small. The perfect example of such data is the DNS
system. To send out the query for a particular domain name, a single datagram is more than enough. Also
the IP that is returned by the Domain Name Server does not require more than one datagram for
transportation. So in such cases instead of making use of the complex TCP protocol, applications fall back
to the UDP protocol.
The UDP protocol works almost the way TCP works. The differences are that, unlike TCP, UDP does not break the
data to be transferred into smaller chunks, does no sequencing (it inserts no sequence number in the header)
and does not resend lost data. Thus we can conclude by saying that the UDP protocol is an unreliable protocol with
no way to confirm that the data has reached the destination.
The UDP protocol does insert a UDP header into the single datagram it is transporting. The UDP header
contains the Source and Destination Port Numbers, a Length field and the Checksum value (the Source and
Destination IP Addresses are carried in the IP header). At 8 octets, the UDP header is smaller than the
TCP Header, which is at least 20 octets.
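To show how small the UDP header really is, here is a minimal C sketch that reads its four 16 bit fields (source port, destination port, length, checksum) out of a raw buffer. The struct udp_header and the function udp_parse are hypothetical names chosen for this example, and the checksum is only read, not verified.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define UDP_HEADER_LEN 8   /* four 16 bit fields */

/* Hypothetical in-memory form of the UDP header, in host byte order. */
struct udp_header {
    uint16_t src_port;
    uint16_t dst_port;
    uint16_t length;     /* header plus data, in octets */
    uint16_t checksum;
};

/* Read a 16 bit value stored in network (big-endian) byte order. */
static uint16_t read_be16(const uint8_t *p)
{
    return (uint16_t)((p[0] << 8) | p[1]);
}

/* Parse the UDP header at the start of buf.
 * Returns 0 on success, -1 if the buffer is too short or the
 * Length field is inconsistent with the buffer. */
int udp_parse(const uint8_t *buf, size_t len, struct udp_header *out)
{
    assert(buf != NULL && out != NULL);
    if (len < UDP_HEADER_LEN)
        return -1;
    out->src_port = read_be16(buf);
    out->dst_port = read_be16(buf + 2);
    out->length   = read_be16(buf + 4);
    out->checksum = read_be16(buf + 6);
    if (out->length < UDP_HEADER_LEN || out->length > len)
        return -1;
    return 0;
}
```

A DNS query, for example, would show up here with dst_port equal to 53.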
It is used by those applications where small chunks of data are involved. It offers services to the User's
Network Applications like NFS (Network File System) and SNMP.
Read RFC 768 for further in depth details about the UDP protocol.
THE NETWORK LAYER
The IP Protocol
Both the TCP and the UDP protocols, after inserting their headers into the datagram(s) given to them, pass
them to the Internet Protocol or the IP Protocol. The main job of the IP protocol is to find a way of
transporting the datagrams to the destination receiver. It does not do any kind of error checking on the data it carries.
The IP protocol too adds its own IP Header to each datagram. The IP header contains the source and
destination IP addresses, the protocol number and yet another checksum. The IP header of a particular
datagram looks like-:
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|Version| IHL |Type of Service| Total Length |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Identification |Flags| Fragment Offset |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Time to Live | Protocol | Header Checksum |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Source Address |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Destination Address |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| TCP header info followed by the actual data being transferred|
| |
The Source and Destination IP addresses are needed so that... well, it is obvious, isn't it? The Protocol
number is added so that the IP protocol at the receiving end knows to which Transport Protocol the datagram has to be passed.
You see various Transport Protocols are used like for example TCP or UDP. So this protocol number is
inserted to tell IP the protocol to which the datagram has to be passed.
It too inserts its own Checksum value which is different from the Checksum Value inserted by the
Transport Protocols. This Checksum has to be inserted as without it the Internet Protocol will not be able to
verify if the Header has been damaged in the transfer process and hence the datagram might reach a wrong
destination. The Time to Live field specifies a value which is decreased each time the datagram passes
through a router; when it reaches zero the datagram is discarded. Remember Tracert?
The Internet Protocol Header contains other fields as well, but they are quite advanced and cannot be
included in a manual which gives an introduction to the TCP\IP protocol. To learn more about the IP
protocol read RFC 791.
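The Header Checksum mentioned above is the standard Internet checksum: the 16 bit one's complement of the one's complement sum of all 16 bit words in the header. Here is a minimal C sketch of it; the function name inet_checksum is hypothetical, and the routine assumes the Header Checksum field has been set to zero before the value is computed.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Internet checksum over len octets of data.
 * To compute a header checksum, zero the checksum field first.
 * To verify a received header, run it over the header as received:
 * an undamaged header yields 0. */
uint16_t inet_checksum(const uint8_t *data, size_t len)
{
    assert(data != NULL);
    uint32_t sum = 0;
    size_t i;

    /* Add up the data as big-endian 16 bit words. */
    for (i = 0; i + 1 < len; i += 2)
        sum += (uint32_t)((data[i] << 8) | data[i + 1]);

    /* An odd trailing octet is padded with a zero octet. */
    if (i < len)
        sum += (uint32_t)(data[i] << 8);

    /* Fold the carries back into the low 16 bits. */
    while (sum >> 16)
        sum = (sum & 0xFFFF) + (sum >> 16);

    return (uint16_t)~sum;
}
```

Because the Time to Live field is part of the header, every router that decreases it also has to update the checksum; a header whose TTL was changed without doing so no longer verifies to 0.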
The Internet Control Message Protocol or the ICMP
The ICMP protocol allows hosts to exchange information on errors that might have occurred during the data
transfer between two hosts. The ICMP is a very simple protocol, even simpler than UDP, as its header does
not carry any port numbers. It is most commonly
used to diagnose Network Problems. The famous utility PING is built on the ICMP protocol. ICMP
requests do not require the user or application to mention any port number as all ICMP requests are
answered by the Network Software itself. The ICMP protocol too handles only a single datagram. That's
why we say in PING only a single datagram is sent to the remote computer. This protocol can report many
network problems like Host Down, Congested Network etc.
Read RFC 792 for further in depth details about the ICMP protocol.
The Link Layer
Most local networks use Ethernet. Each machine in a network has its own IP address and its own Ether
Address. The Ether Address of a computer is different from its IP address. An Ether Address is a 48 bit
address while the IP address is only a 32 bit address. A Network must know which computer to deliver the
datagram to. Right? For this the Ether Header is used.
The Ether Header is a 14 octet header that contains the Source and Destination Ethernet address, and a type
code. Ether too calculates its own Checksum value. The Type code relates to the protocol families to be
used within the Network. At the receiving end, the Ether Layer removes the Ether Header and passes the
datagram to the protocol specified by this field. There is simply no connection between the Ethernet Address and the IP address
of a machine. Each machine needs to keep an Ethernet to IP address translation table.
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Ethernet destination address (first 32 bits) |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Ethernet dest (last 16 bits) |Ethernet source (first 16 bits) |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Ethernet source address (last 32 bits) |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Type code |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| IP header, then TCP header, then your data |
| |
| |
| |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Ethernet Checksum |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
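The 14 octet layout in the diagram above (6 octets of destination address, 6 octets of source address, 2 octets of type code) maps directly onto a small C parser. This is only a sketch: the struct ether_header_sketch and the function ether_parse are hypothetical names, and the trailing Ethernet Checksum is not examined. The two type codes shown, 0x0800 for IP and 0x0806 for ARP, are the standard values.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define ETHER_HEADER_LEN 14
#define ETHER_TYPE_IPV4  0x0800   /* payload is an IP datagram */
#define ETHER_TYPE_ARP   0x0806   /* payload is an ARP message */

/* Hypothetical in-memory form of the Ether Header. */
struct ether_header_sketch {
    uint8_t  dst[6];   /* 48 bit destination Ethernet address */
    uint8_t  src[6];   /* 48 bit source Ethernet address      */
    uint16_t type;     /* type code, in host byte order       */
};

/* Parse the Ether Header at the start of a frame.
 * Returns 0 on success, -1 if the frame is shorter than 14 octets. */
int ether_parse(const uint8_t *frame, size_t len, struct ether_header_sketch *out)
{
    assert(frame != NULL && out != NULL);
    if (len < ETHER_HEADER_LEN)
        return -1;
    memcpy(out->dst, frame, 6);
    memcpy(out->src, frame + 6, 6);
    out->type = (uint16_t)((frame[12] << 8) | frame[13]);
    return 0;
}
```

The receiving Ether Layer would look at the type field and hand everything after the first 14 octets to IP or ARP accordingly.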
Address Resolution Protocol or ARP
Data before being transmitted across the Internet or across a local network is broken down into smaller
Packets which are suitable for transfer over the net. These packets have the Source and Destination IP's but
for the transfer to take place the suitable Hardware Addresses or the MAC addresses must also be known.
That is where ARP comes in.
To get the Hardware MAC addresses, ARP or Address Resolution Protocol sends a request message. The
machine which owns the requested IP address (often the Router) replies with its Hardware Address. It is
similar to the DNS and it too has a cache. This cache can be
a bit vulnerable as a Hacker could forge a connection from a remote machine claiming to be one of the
cached locations. So we can conclude that ARP translates IP's into Ethernet Addresses. One thing to
remember about ARP is that it only translates outgoing packets.
There is also something called the RARP which is an abbreviation for Reverse Address Resolution
Protocol, which like the name says does the exact reverse of what ARP does.
There is simply no algorithm to get the Ethernet Address from the IP Address. To carry out such
translations, each computer has a file which has a table with rows for each computer and two columns for
their corresponding IP address and Ethernet Address. The File is somewhat like the following-:
                 Internet Protocol Address    Ethernet Address
Computer Name    xxx.xy.yy.yx                 08-00-39-00-2F-C3
Say there is a system in a Network (A) and an unidentified system (B) contacts it. Now A only knows the
IP address of B. Now A will first try to identify whether B is on the same network so that it can directly
communicate via Ethernet. So it will first check the IP to MAC address translation table which it has. If it
finds the IP in the table then well and good and A will establish a connection with B via Ethernet.
On the other hand if A does not find any match for the specific IP, it will send out a request in the form of
a 'Broadcast'. All computers within the Network will receive this broadcast, and the computer which owns
that IP address will reply with its MAC address. A basic difference between an IP
address and MAC address is that an IP is in the form xxx.xxx.xxx.xxx and a MAC address is in the form
xx:xx:xx:xx:xx:xx and one is 32 bit while the other is 48 bit.
Read RFC 826 for further in depth details about the ARP protocol.
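The "check the table first, broadcast only on a miss" logic can be sketched as a simple lookup in C. The struct arp_entry and the function arp_lookup are hypothetical names for this illustration; a real ARP cache also ages out old entries, which is left out here.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* One row of the IP to MAC translation table. */
struct arp_entry {
    uint32_t ip;       /* 32 bit IP address       */
    uint8_t  mac[6];   /* 48 bit Ethernet address */
};

/* Look up ip in the table.
 * Returns a pointer to the 6 octet MAC address on a hit, or NULL on a
 * miss, in which case the caller would send out the ARP broadcast. */
const uint8_t *arp_lookup(const struct arp_entry *table, size_t count, uint32_t ip)
{
    assert(table != NULL || count == 0);
    for (size_t i = 0; i < count; i++) {
        if (table[i].ip == ip)
            return table[i].mac;
    }
    return NULL;
}
```

A NULL result corresponds to the second case in the text: A has no entry for B, so it has to ask the whole Network.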
Application Layer
Till now you have learnt how data is broken down into smaller chunks, and transferred to the destination,
where the chunks are rearranged. But there is yet another aspect to a successful data transfer process, which
we have not discussed yet: The Application Protocols and the Application Layer itself. A host which
receives datagrams has many applications or services (daemons) running which are ready to establish a
TCP connection and accept a message. Datagrams travelling on the Internet must know which application
they have to establish connection with, which application they have to send the message to. A typical web
server will have the FTP daemon, the HTTP daemon, the POP daemon, and the SMTP daemon running.
Wouldn't the datagrams get confused as to which daemon to send the message to?
For the datagrams to know which computer to send the message to, we have IP addresses. The datagram
knows what daemon or application to send the message to by the Port Number attached to the IP address of
the Destination. A TCP connection is actually fully described by 4 numbers: the IP addresses of the Source and
Destination and the TCP Port Number at each end. The Port Numbers are found in
the TCP Header, while the IP addresses are found in the IP Header.
To make it simpler to understand I have included an excerpt from the Net Tools Chapter:
What is all the hype about socket programming? What exactly are sockets?
TCP\IP or Transmission Control Protocol\ Internet Protocol is the language or the protocol used by
computers to communicate with each other over the Internet. Say a computer whose IP address is
99.99.99.99 wants to communicate with another machine whose IP address is 98.98.98.98, then what will
happen?
The machine whose IP is 99.99.99.99 sends a packet addressed to another machine whose IP is
98.98.98.98. When 98.98.98.98 receives the packet then it verifies that it got the message by sending a
signal back to 99.99.99.99. But say the person who is using 99.99.99.99 wants to have more
than one connection to 98.98.98.98 simultaneously... then what will happen? Say 99.99.99.99 wants to connect to
the FTP daemon and download a file by FTP and at the same time it wants to connect to 98.98.98.98's
website i.e. the HTTP daemon. Then 98.98.98.98 will have 2 connections with 99.99.99.99 simultaneously.
Now how can 98.98.98.98 distinguish between the two connections... how does 98.98.98.98 know which
is for the FTP daemon and which for the HTTP daemon? If there was no way to distinguish between the
two connections then they would both get mixed up and there would be a lot of chaos with the message
meant for the HTTP daemon going to the FTP daemon. To avoid such confusion we have ports. At each
port a particular service or daemon is running by default. So now that the 99.99.99.99 computer knows
which port to connect to, to download a FTP file and which port to connect to, to download the web page,
it will communicate with the 98.98.98.98 machine using what is known as the socket pair which is a
combination of an IP address and a Port. So in the above case the message which is meant for the FTP
daemon will be addressed to 98.98.98.98 : 21 (Notice the colon and the default FTP port succeeding it.).
So the receiving machine i.e. 98.98.98.98 will know which service this message is meant for and
which port it should be directed to.
In TCP\IP or over the Internet all communication is done using the Socket pair i.e. the combination of the
IP address and the port.
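Here is a small C sketch of how the two simultaneous connections in the example stay apart. The struct conn_id, the functions conn_same and service_for_port and the enum service are hypothetical names for this illustration, and the client side port numbers used in the usage note (1025 and 1026) are made up; the server ports 21, 25, 80 and 110 are the usual defaults for FTP, SMTP, HTTP and POP3.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* The 4 numbers that identify one TCP connection. */
struct conn_id {
    uint32_t src_ip;
    uint16_t src_port;
    uint32_t dst_ip;
    uint16_t dst_port;
};

/* Two segments belong to the same connection only if all 4 numbers match. */
int conn_same(const struct conn_id *a, const struct conn_id *b)
{
    assert(a != NULL && b != NULL);
    return a->src_ip == b->src_ip && a->src_port == b->src_port &&
           a->dst_ip == b->dst_ip && a->dst_port == b->dst_port;
}

enum service { SVC_UNKNOWN, SVC_FTP, SVC_SMTP, SVC_HTTP, SVC_POP3 };

/* Pick the daemon by destination port, using the default port numbers. */
enum service service_for_port(uint16_t dst_port)
{
    switch (dst_port) {
    case 21:  return SVC_FTP;
    case 25:  return SVC_SMTP;
    case 80:  return SVC_HTTP;
    case 110: return SVC_POP3;
    default:  return SVC_UNKNOWN;
    }
}
```

With 99.99.99.99 (0x63636363) as the source and 98.98.98.98 (0x62626262) as the destination, the connections { 0x63636363, 1025, 0x62626262, 21 } and { 0x63636363, 1026, 0x62626262, 80 } differ in their port numbers, so the first goes to the FTP daemon and the second to the HTTP daemon without any mix up.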
*****************
HACKING TRUTH: Learn More about Ports, IP addresses and Sockets by reading the Net Tools Chapter.
*****************
The Application Layer basically consists of the Applications running on your computer and the
Applications running on the host to which you are connected. Say you are viewing the Hotmail Site, then
the application layer comprises the Web Browser running on your computer and the HTTP daemon
running at Hotmail's server and the Application Protocol being used to communicate is HyperText Transfer
Protocol.
As soon as a TCP connection is established the Applications running on Each end decide the language or
protocol to be used to communicate and send datagrams.
IP Spoofing Torn Apart
IP spoofing is the most exciting topic you will hear wannabe hackers talking about. It is also a subject
about which no one knows much. Before we continue I would like to tell you that IP Spoofing is quite
difficult to understand and a lot of people have trouble understanding how it is done. The other downside it
has is the fact that it can hardly be done using a Windows system and a system administrator can easily
protect his system from IP spoofing.
So what is IP Spoofing? IP Spoofing is a trick played on servers to fool the target computer into thinking
that it is receiving data from a source other than you. This in turn basically means to send data to a remote
host so that it believes that the data is coming from a computer whose IP address is something other than
yours. Let's take an example to make it clear:
Your IP is : 203.45.98.01 (REAL)
IP of Victim computer is: 202.14.12.1 (VICTIM)
IP you want data to be sent from: 173.23.45.89 (FAKE)
Normally sitting on the computer whose IP is REAL, the datagrams you send to VICTIM will appear to
have come from REAL. Now consider a situation in which you want to send a datagram to VICTIM and
make him believe that it came from a computer whose IP is FAKE. This is when you perform IP Spoofing.
The Main problem with IP Spoofing is that even if you are able to send a spoofed datagram to the remote
host, the remote host will reply not to your real IP but to the Fake IP you made your datagram seem to have
come from. Getting confused? Read the following example to clear up your mind.
Taking the same IP's as in the last example, consider the following scenario. Normally, when REAL connects to
VICTIM, the standard three way handshake takes place and VICTIM sends its SYN/ACK message to
REAL. Now if you spoof your IP, to say FAKE, then VICTIM will try to establish the TCP connection with FAKE and
will send the SYN/ACK message to FAKE. Now let's assume that FAKE is alive, then as it had not requested the
SYN/ACK message (sent by VICTIM to FAKE) it will reply with a RST message which would basically end
the connection and no further communication between FAKE and VICTIM would take place. Now if
FAKE doesn't exist then the SYN/ACK message sent by VICTIM will not get any reply and in the end the
connection times out.
Due to this FAKE and REAL IP problem, when a person is trying to perform an IP Spoof, he does not get
any response from the remote host and has no clue whether he has been successful or whether he has made
any progress at all. You are as good as blind, with no medium through which you could get feedback.
IP Spoofing can be successful only if the computer with the FAKE IP does not reply to the victim and does not
interrupt the spoofed connection. Take the example of a telephone conversation, you can call up a person
' x ' and pretend to be ' y ' as long as ' y ' does not interrupt the conversation and give the game away.
So why would you need to perform IP Spoofing-:
1.) To Pretend that you are some other computer whose IP address is amongst the trusted list of computers
on the victim's disk. This way you can exploit the 'r' services and gain access to the network as you are
then believed to be from a trusted source.
2.) To Disguise or Mask your IP address so that the victim does not know who you really are and where
the data is coming from.
If you ever read the alt.2600 or the alt.hacking newsgroup, you would probably find many postings like "I
have Win98, how do I Spoof my IP" or even " I do not know TCP/IP. tell me how to perform IP spoofing".
You see the very fact that they are posting such questions and expect to learn how to spoof their IP without
even knowing a bit about TCP\IP, confirms the fact that they would not be able to perform IP Spoofing. No
I am not saying that asking questions is bad, but you see not knowing something is not so bad, but not
knowing something and showing ignorance towards learning it is really, really bad.
You see IP spoofing is a very complex subject and difficult to perform. You need to work through entire TCP/IP and
Networking Protocols manuals and need to be able to write C programs which will help you in the
Spoofing process. It is amazing how people even think that they can spoof their IP without even knowing
what TCP/IP stands for.
You see all packets travelling across the Internet have headers which contain the source and destination IP
addresses and port numbers, so that the packet knows where to go and the destination knows where the
packet has come from and where to respond. Now the process of Spoofing means to change the source IP
address contained in the Header of the packet, in turn fooling the receiver of the Packets into believing that
the packet came from somewhere else, which is a fake IP. Now let's again look at the IP Header of a
datagram.
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|Version| IHL |Type of Service| Total Length |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Identification |Flags| Fragment Offset |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Time to Live | Protocol | Header Checksum |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Source Address |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Destination Address |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| TCP header info followed by the actual data being transferred |
|
Now basically to perform IP spoofing we need to be able to change the value of the field, Source Address.
To do this you also need to be able to guess sequence numbers, which is quite a sophisticated process, and I will
try to explain it as clearly as possible. Before we go on, you need to understand the fact that IP spoofing is
not the entire process, it is just a stepping stone in the entire process of fooling the remote host and
establishing a trust relationship with the remote host.
So how do these trust relationships take place? Well all of you have encountered some form of
authentication process or the other. The Username-Password pair is the most commonly used form of
authentication, with which we are very much familiar. In the Username-Password form
of authentication, the remote host to which the client is connected challenges the client by asking
the User to type in the Username and Password. So in this form of authentication, the User needs to
intervene.
Other than the Username-Password form of authentication there is yet another form of authentication
most users do not know of. This is the Client IP. In this form of authentication, the
remote host finds out the IP address of the client and compares it with a predefined list of IP's. If the
IP of the client who is trying to establish a connection with the remote host is found in the list of IP's
maintained by the host, then it allows the client access to the shell 'without a password' as the identity of
the client is taken to be already authenticated.
Such trust relationships are common in Unix Systems which have certain 'R services' like rsh,
rlogin, rcp which have certain security problems and should be avoided. Despite the threat involved most
ISP's in India still keep the ports of the R services open to be exploited by Hackers. You normally establish
a Rlogin trust relationship by using the Unix command,
$>rlogin IP address
**************
HACKING TRUTH: Well there is definitely a cooler way of establishing a trust relationship with a remote
host, using Telnet. The default port numbers at which the R services run are 512, 513,514
**************
So how do I spoof my IP? Well in short, to spoof your IP, you need to be able to predict sequence numbers,
this will be clearer after reading the next few paragraphs.
To understand Sequence Numbers you need to go back to how the TCP protocol works. You already
know that TCP is a reliable protocol and has certain in-built features which have the ability to rearrange or
resend lost, duplicated or out of sequence data. To make sure that the destination is able to rearrange the
datagrams in the correct order, TCP inserts two sequence numbers into each TCP datagram. One Sequence
number tells the receiving computer where a particular datagram belongs while the second (the
Acknowledgment Number) says how much data has been received by the sender. Anyway, let's move on, TCP also relies on
ACK messages, and on resending data which has not been acknowledged in time, to ensure that all datagrams
have reached the destination error free.
Now we need to reanalyze the TCP Header to understand certain other aspects of sequence numbers and
the ACK Number.
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Source Port | Destination Port |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Sequence Number |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Acknowledgment Number |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Data | |U|A|P|R|S|F| |
| Offset| Reserved |R|C|S|S|Y|I| Window |
| | |G|K|H|T|N|N| |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Checksum | Urgent Pointer |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| The Actual Data form the next 500 octets |
| |
You see the TCP Header contains a Sequence Number which actually represents the sequence number of
the first byte of that particular TCP segment. A sequence number is a 32 Bit number which is attached to
all bytes (data) being exchanged across a Network. The ACK Number Field in the TCP header, actually
contains the value of the sequence number which it expects to be the next. Not only that, it also does what
it was meant to do, acknowledge data received. Confused? Read it again till you get the hang of it.
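The relation between the Sequence Number of a segment and the ACK Number that comes back for it is simple arithmetic, sketched below in C. The function name next_expected_seq is hypothetical. The sketch relies on two facts from the TCP specification: sequence numbers are 32 bit and wrap around, and the SYN and FIN flags each use up one sequence number even though they carry no data.

```c
#include <assert.h>
#include <stdint.h>

/* Given a segment that starts at sequence number seq and carries
 * payload_len octets of data, return the ACK Number the receiver sends
 * back, i.e. the sequence number it expects next.
 * syn and fin are non-zero if the segment has that flag set. */
uint32_t next_expected_seq(uint32_t seq, uint32_t payload_len, int syn, int fin)
{
    /* A single segment cannot carry more data than fits in an IP datagram. */
    assert(payload_len <= 65535u);
    /* Unsigned arithmetic wraps modulo 2^32, just like sequence numbers do. */
    return seq + payload_len + (syn ? 1u : 0u) + (fin ? 1u : 0u);
}
```

So a segment starting at 1000 with 500 octets of data is answered with an ACK Number of 1500, and a bare SYN with sequence number 5000 is answered with 5001.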
On the older Unix TCP implementations against which this attack was described, the initial sequence number
or ISN is initialized to 1 when the host boots. This ISN
is then incremented by 128,000 every second. There is a certain pattern according to which the sequence
numbers increment or change which makes them easy to predict.
To successfully perform IP spoofing or in order to predict Sequence Numbers, you need to be running a
form of UNIX, as Windows does not provide the users with access to really advanced system stuff.
Without a form of Unix IP Spoofing is almost impossible to do.
This text is not the ultimate guide to IP Spoofing and was aimed at only giving you a general outline of the
whole process. Sequence number Prediction is really, really sophisticated and difficult to understand, but
not impossible to do. However a system administrator can easily save his systems from IP spoofing and this
actually makes it quite useless, nonetheless truly exciting. If You really want to learn IP Spoofing I suggest
you read IP Spoofing Demystified by daemon9/route/infinity which was a part of Issue 48 of PHRACK
magazine, File 14 of 18. Go to the Archive Section of their site, http://www.phrack.com and click on Issue
48.
This brings me to the other purpose people use IP Spoofing for, IP Masking. Now to do something as simple as
masking or hiding your IP you do not need to go through the complex procedure of guessing sequence numbers
and performing IP Spoofing. There are proxy servers to do that for you. Read the Net Tools chapter for
further details.
Port Scanning in Networking Terms
Earlier we learnt what a Port scan is and why it is considered to be such an important tool for getting information
about the remote host, which in turn can be used to exploit any vulnerabilities and break into the system.
We all know how a manual Port Scan works. You launch Telnet and manually Telnet to each Port jotting
down information that you think is important. In a manual Port Scan, when you telnet to a port of a remote
host, a full three way handshake takes place, which means that a complete TCP connection opens.
The earliest and the oldest version of Port Scanners used the same technique. They connected to each port
and established a full three way handshake for a complete TCP connection. The downside of such port
scanners was the fact that as a full TCP connection was being established, the system administrator could
easily detect that someone is trying to port scan his systems to find a vulnerability. However such port
scanning methods also had a bright side, as an actual TCP connection was being established, the port
scanning software did not have to build a Fake Internet Protocol Packet. (This IP Packet is used to scan the
remote systems.) Such TCP scanners too relied on the three-way TCP handshake to detect if a port is open
or not. The Basic process of detecting whether a port is open or not has been described below:
1.) You send a TCP Packet containing the SYN flag to remote host.
2.) Now the remote host checks whether the port is open or not. If the port is open then it replies with a
TCP packet containing both an ACK message confirming that the port is open and a SYN flag. On the
other hand if the port is closed then the remote host sends the RST flag which resets the connection, in
short closes the connection.
3.) This third phase is optional and involves the sending of an ACK message by the client.
As TCP Scanners were detectable, programmers around the world developed a new kind of port scanner,
the SYN Scanner, which did not establish a complete TCP connection. These kinds of port scanners remain
undetectable by only sending the first single TCP Packet containing the SYN flag and establishing a half
TCP Connection. To understand the working of a SYN or Half SYN Port Scanner simply read its 3 step
working-:
1. SYN Port Scanner sends the first TCP packet containing the SYN flag to the remote host.
2. The remote system replies with, either a SYN plus ACK or a RST.
3. When the SYN Port scanner receives one of the above responses, it knows whether the respective port
is open or not and whether a daemon is ready listening for connections.
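The decision made in the last step comes down to looking at the flag bits of the reply. Here is a minimal C sketch of just that decision; it does not send or receive anything. The names classify_syn_reply and port_state are hypothetical, while the flag bit values and the position of the flags (octet 13 of the TCP header) follow the TCP header layout.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define RFLAG_SYN 0x02
#define RFLAG_RST 0x04
#define RFLAG_ACK 0x10

#define TCP_FLAGS_OFFSET 13   /* the flag bits live in octet 13 of the header */

enum port_state { PORT_OPEN, PORT_CLOSED, PORT_UNKNOWN };

/* Decide what a reply to a SYN means, given the raw TCP header of the reply
 * (at least 20 octets). */
enum port_state classify_syn_reply(const uint8_t *tcp_header)
{
    assert(tcp_header != NULL);
    uint8_t flags = tcp_header[TCP_FLAGS_OFFSET];

    if (flags & RFLAG_RST)
        return PORT_CLOSED;                     /* connection was reset  */
    if ((flags & (RFLAG_SYN | RFLAG_ACK)) == (RFLAG_SYN | RFLAG_ACK))
        return PORT_OPEN;                       /* a daemon is listening */
    return PORT_UNKNOWN;                        /* anything else         */
}
```

The same check is what a system administrator's detection tool can use from the other side, to notice handshakes that are started but never completed.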
The SYN Port Scanners were undetectable by most normal system port scan detectors, however newer port
scan detectors like netstat and also some firewalls can filter out such scans. Another downside to such
scanning is that the method in which the scanner makes the IP packet varies from system to system.
UDP Scanning
It is yet another port scanning technique which can be used to scan a UDP port to see if it is listening. To
detect an open UDP port, simply send a single UDP Packet to the port. If it is listening, you may get a
response or no reply at all; if it is not, then ICMP takes over and returns the error message, "Destination Port
Unreachable".
FIN Port Scanners
FIN Port Scanners are my favorite type of port scanners. They send a single packet containing the FIN flag. If
the remote host returns a RST flag then the port is closed, if no RST flag is returned, then it is open and
listening.
Some port scanners also use the technique of sending an ACK packet and if the Time To Live or ttl of the
returning packets is lower than that of the RST packets received (earlier), or if the window size is greater than
zero, then the port is probably open and listening.
The Following is the code of a supposedly Stealth Port Scanner which appeared in the Phrack Magazine.
/*
* scantcp.c
*
* version 1.32
*
* Scans for listening TCP ports by sending packets to them and waiting for
* replies. Relys upon the TCP specs and some TCP implementation bugs found
* when viewing tcpdump logs.
*
* As always, portions recycled (eventually, with some stops) from n00k.c
* (Wow, that little piece of code I wrote long ago still serves as the base
* interface for newer tools)
*
* Technique:
* 1. Active scanning: not supported - why bother.
*
* 2. Half-open scanning:
* a. send SYN
* b. if reply is SYN|ACK send RST, port is listening
* c. if reply is RST, port is not listening
*
* 3. Stealth scanning: (works on nearly all systems tested)
* a. sends FIN
* b. if RST is returned, not listening.
* c. otherwise, port is probably listening.
*
* (This bug in many TCP implementations is not limited to FIN only; in fact
* many other flag combinations will have similar effects. FIN alone was
* selected because always returns a plain RST when not listening, and the
* code here was fit to handle RSTs already so it took me like 2 minutes
* to add this scanning method)
*
* 4. Stealth scanning: (may not work on all systems)
* a. sends ACK
* b. waits for RST
* c. if TTL is low or window is not 0, port is probably listening.
*
* (stealth scanning was created after I watched some tcpdump logs with
* these symptoms. The low-TTL implementation bug is currently believed
* to appear on Linux only, the non-zero window on ACK seems to exists on
* all BSDs.)
*
* CHANGES:
* --------
* 0. (v1.0)
* - First code, worked but was put aside since I didn't have time nor
* need to continue developing it.
* 1. (v1.1)
* - BASE CODE MOSTLY REWRITTEN (the old code wasn't that maintainable)
* - Added code to actually enforce the usecond-delay without usleep()
* (replies might be lost if usleep()ing)
* 2. (v1.2)
* - Added another stealth scanning method (FIN).
* Tested and passed on:
* AIX 3
* AIX 4
* IRIX 5.3
* SunOS 4.1.3
* System V 4.0
* Linux
* FreeBSD
* Solaris
*
* Tested and failed on:
* Cisco router with services on ( IOS 11.0)
*
* 3. (v1.21)
* - Code commented since I intend on abandoning this for a while.
*
* 4. (v1.3)
* - Resending for ports that weren't replied for.
* (took some modifications in the internal structures. this also
* makes it possible to use non-linear port ranges
* (say 1-1024 and 6000))
*
* 5. (v1.31)
* - Flood detection - will slow up the sending rate if not replies are
* recieved for STCP_THRESHOLD consecutive sends. Saves alot of resends
* on easily-flooded networks.
*
* 6. (v1.32)
* - Multiple port ranges support.
* The format is: <start-end>|<num>[,<start-end>|<num>,...]
*
* Examples: 20-26,113
* 20-100,113-150,6000,6660-6669
*
* PLANNED: (when I have time for this)
* ------------------------------------
* (v2.x) - Multiple flag combination selections, smart algorithm to point
* out uncommon replies and cross-check them with another flag
*
*/
#define RESOLVE_QUIET
#include <stdio.h>
#include <netinet/in.h>
#include <netinet/ip.h>
#include <netinet/ip_tcp.h>
#include <sys/time.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <unistd.h>
#include <stdlib.h>
#include <string.h>
#include <signal.h>
#include <errno.h>
#include "resolve.c"
#include "tcppkt03.c"
#define STCP_VERSION "1.32"
#define STCP_PORT 1234 /* Our local port. */
#define STCP_SENDS 3
#define STCP_THRESHOLD 8
#define STCP_SLOWFACTOR 10
/* GENERAL ROUTINES ------------------------------------------- */
void banner(void)
{
    printf("\nscantcp\n");
    printf("version %s\n", STCP_VERSION);
}
void usage(const char *progname)
{
    printf("\nusage: \n");
    printf("%s <method> <source> <dest> <ports> <udelay> <delay> [sf]\n\n", progname);
    printf("\t<method> : 0: half-open scanning (type 0, SYN)\n");
    printf("\t 1: stealth scanning (type 1, FIN)\n");
    printf("\t 2: stealth scanning (type 2, ACK)\n");
    printf("\t<source> : source address (this host)\n");
    printf("\t<dest> : target to scan\n");
    printf("\t<ports> : ports and/or ranges to scan - eg: 21-30,113,6000\n");
    printf("\t<udelay> : microseconds to wait between TCP sends\n");
    printf("\t<delay> : seconds to wait for TCP replies\n");
    printf("\t[sf] : slow-factor in case sends are detected to be too fast\n\n");
}
/* OPTION PARSING etc ---------------------------------------- */
unsigned char *dest_name;
unsigned char *spoof_name;
struct sockaddr_in destaddr;
unsigned long dest_addr;
unsigned long spoof_addr;
unsigned long usecdelay;
unsigned waitdelay;
int slowfactor = STCP_SLOWFACTOR;
struct portrec /* the port-data structure */
{
    unsigned n;
    int state;
    unsigned char ttl;
    unsigned short int window;
    unsigned long int seq;
    char sends;
} *ports;
char *portstr;
unsigned char scanflags;
int done;
int rawsock; /* socket descriptors */
int tcpsock;
int lastidx = 0; /* last sent index */
int maxports; /* total number of ports */
void timeout(int signum) /* timeout handler */
{ &
Top 5 Myths About Safe Surfing
There was a time when this statement was partially true, but that time has long since passed. Current viruses, worms, and other threats, including the famous Love Bug, Nimda, and Blaster, spread blindly across the Internet to thousands or millions of PCs in a matter of hours, without regard for who owns them, what is stored there, or the value of the information they hold. The purpose of such attacks is nothing less than to wreak havoc. If you ignore the reality of these attacks, you are certain to be hit at one time or another. Even if your computer is not attacked directly, it can be used as a zombie to launch a denial-of-service or other attack on a network or to send spam or pornography to other PCs without being traced. Therefore, your civic responsibility is to protect your PC so that others are protected.
I can protect my PC if I disconnect from the Internet or turn it off when I'm not using it.
Wrong. If you connect to the Internet at all, you are a target. You could download a virus when you connect and not activate it until days later when you read your e-mail off-line. Even if you rarely connect to the Internet, you can get a virus from a file off of a network, floppy disk, or USB flash memory drive.
I can protect myself from viruses by not opening suspicious e-mail attachments.
Wrong again. The next virus you get may come from your best friend's or boss' computer if his e-mail address book was used to propagate an attack. Nimda and other hybrid worms can enter through the Web browser. And it is possible to activate some viruses simply by reading or previewing an e-mail. You simply must have a PC-based antivirus package.
I have a Macintosh (or a Linux-based system), not a Windows system, so I don't have to worry about being attacked.
It is true that most attacks target Microsoft Windows–based PCs, but there have been attacks against Mac OS and Linux systems as well. Some experts have predicted that the Mac virus problem will get worse, because Mac OS X uses a version of Unix. And although these systems have some useful security features, they can still be attacked.
My system came with an antivirus package, so I'm protected.
Not quite. First, if you haven't activated your antivirus package to scan incoming traffic automatically, you are not protected against e-mail and Web browser attacks. Second, new threats appear daily, so an antivirus package is only as good as its last update. Activate the auto-update features to stay on top of the latest threats. Third, an antivirus package can't protect you from every threat. In most cases you need a combination of solutions, including, at minimum, antivirus, a personal firewall such as Zone Labs' ZoneAlarm Pro, and a plan for keeping your operating system and software up to date with security patches. Antispyware and antispam utilities (such as PepiMK Software's SpyBot Search & Destroy and Norton AntiSpam 2004) will also help keep you safe.
Translating Binary To Text
Translating Binary to Text: The Hard Way
A Tutorial for those willing to Learn
Contents
1. Introduction
2. The Binary System
3. Converting Binary to ASCII (Text)
Introduction:
We’ve all seen binary code. We’ve come to think of it as a bunch of ones and zeroes in long strings…
010010101010101001101011
But these ones and zeroes can also represent decimal numbers. First off, I will show you how to read these numbers as the decimal numbers we’re used to in our daily life. Then, I will show you how to use those numbers and your keypad to translate them into text. Note that your computer doesn’t use the decimal system, so technically, when it converts binary to text, it doesn’t go through the process I will show you. This is just an entertaining way of explaining to you how the binary system works.
The Binary System:
Here’s a simple example of binary:
10101
Let’s think of the example above as empty slots:
_ _ _ _ _
First off, you read binary from right to left. It’s just the way it’s designed. The first slot from the right represents a value of one, the second from the right a value of two, the third from the right a value of four, the fourth from the right a value of eight, the fifth from the right a value of sixteen, and so on, with each slot worth double the slot to its right. This will never change.
By putting a 1 or a 0 in those slots you are either saying you want the corresponding value that’s attached to that slot or you don’t. A 1 means yes, and a 0 means no. For example, putting a zero in the first slot from the right, but a 1 in the second slot from the right means you want a two, but not a one:
_ _ _ 1 0
As such, the number above equals a decimal value of two.
As an example, let’s say you want to represent eight in binary form. Thinking about the slots from the right, you want the first slot to be 0 because you don’t want a one, the second slot to be 0 because you don’t want a two, and the third slot to be 0 because you don’t want a four, but you want the fourth slot to be 1 because you want a value of eight. As such, eight in binary form is:
1 0 0 0 (or simply 1000 without the spaces)
Now it is important to note that the number of zeroes that precede the first 1 from the left is unimportant. So for example:
1 0 0 0 is the same as 0 0 0 1 0 0 0 (1000 = 0001000)
To get it cleared up, here’s another example:
0 1 is the same as 1
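If you would like to check the slot values and the leading-zero rule on a computer, here is a minimal Python sketch. The helper name slot_values is my own invention for illustration; int(text, 2) is Python's built-in way of reading a string of binary digits.

```python
def slot_values(count):
    """Return the values of the first `count` slots, reading from the right."""
    return [2 ** position for position in range(count)]


# Each slot is worth double the one before it.
print(slot_values(5))        # [1, 2, 4, 8, 16]

# int(text, 2) reads a string of binary digits as a number.
print(int("10", 2))          # 2
print(int("1000", 2))        # 8
print(int("0001000", 2))     # 8, the leading zeroes change nothing
```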
Exercises: What do the following equal in decimal terms?
a) 100
b) 000100
c) 100000
d) 0010
Answers:
a) 4
b) 4
c) 32
d) 2
If you got the answers above right, then you pretty much understand the basics of binary.
Let’s now work out the decimal values of binary numbers that are not exact powers of 2, that is, numbers with a 1 in more than one slot.
To get the total value of a binary number, add the values corresponding to each slot. So, for example, three in binary would be:
11
The above corresponds to three because if you add the values of all the slots that hold a 1, that is to say a one from the first slot from the right and a two from the second slot from the right, the total equals three.
As another example, let’s say you want to represent 5 in binary terms. Then you would need a value of one to be added to a value of four, and you would not want a value of two:
101 [Reading from the right: 1(one) + 0(two) + 1(four) = five]
Here’s an additional example:
001011 [Reading from the right: 1(one) + 1(two) + 0(four) + 1(eight) + 0(sixteen) + 0(thirty-two) = eleven]
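The same adding-up can be written as a short Python sketch. This is only one way to do it, and the function name binary_to_decimal is my own choice; it walks the digits from right to left and adds a slot's value whenever that slot holds a 1.

```python
def binary_to_decimal(bits):
    """Add up the slot values wherever the binary string has a 1."""
    total = 0
    slot_value = 1                    # the rightmost slot is worth one
    for digit in reversed(bits):      # read from right to left
        if digit == "1":
            total += slot_value       # "yes, I want this slot's value"
        elif digit != "0":
            raise ValueError("not a binary digit: " + repr(digit))
        slot_value *= 2               # the next slot is worth double
    return total


print(binary_to_decimal("11"))        # 3
print(binary_to_decimal("101"))       # 5
print(binary_to_decimal("001011"))    # 11
```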
Exercises: What do the following equal in decimal terms?
a) 11011
b) 110
c) 010101
d) 10110
Answers:
a) 27
b) 6
c) 21
d) 22
If you got the above questions correct [without cheating], then you essentially understand the binary system. Understanding the binary system was the hard part. What follows is pretty easy.
3. Converting Binary to ASCII (Text)
ASCII is essentially a numbering scheme for the letters, numbers and symbols our computers use: each character is assigned a code number, and fonts only decide how that character is drawn on screen. When the keyboard relays the buttons you pressed, it sends in a code which is then converted to the ASCII equivalent of “k” or “5” or whatever key you pressed.
Here’s an example of a message “hidden” in binary text:
0100100001100101011011000110110001101111
Now there are only so many letters, numbers and symbols stored for ASCII (128 of them, numbered 0 through 127). Sets of 8 binary digits can represent 256 different values, which is more than enough to cover all of these letters and the like. As such, all strings that represent text like the one above are separated into groups of 8 bits (called bytes) for simplicity:
01001000 01100101 01101100 01101100 01101111
Okay, so our example message was separated into 8-digit strings. Here is the decimal value of each of these strings, calculated for you:
01001000 = 72
01100101 = 101
01101100 = 108
01101100 = 108
01101111 = 111
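The splitting and the arithmetic above can also be done with a few lines of Python. This is just a sketch; the function name split_into_bytes is made up for this example, and it accepts the message with or without spaces.

```python
def split_into_bytes(bits, size=8):
    """Cut a string of binary digits into groups of `size` digits."""
    bits = bits.replace(" ", "")      # ignore any spaces between groups
    if len(bits) % size != 0:
        raise ValueError("length is not a multiple of %d" % size)
    return [bits[start:start + size] for start in range(0, len(bits), size)]


message = "01001000 01100101 01101100 01101100 01101111"
groups = split_into_bytes(message)

# Convert every 8-digit group to its decimal value.
for group in groups:
    print(group, "=", int(group, 2))
```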
The result was 72,101,108,108,111. Now, there is something called the ASCII table. It essentially maps the binary numbers from before to the equivalent letters/symbols/numbers. But since we found the decimal values of these binary strings, we can use a major shortcut.
By holding down ALT and typing the number on the numeric keypad, you will get the ASCII equivalent of that number (this shortcut works on Windows). For example, by holding down the ALT key and typing 72 on the numeric keypad in any text editor, then releasing ALT, you will get the corresponding “H” to show up.
Let’s do so for the entire example message:
72 = H
101 = e
108 = l
108 = l
111 = o
So the entire “hidden” message translates to “Hello”.
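If you don't have a numeric keypad handy, the same table lookup can be sketched in Python, whose built-in chr() turns a code number into its character and ord() goes the other way. The function name codes_to_text is my own, chosen just for this illustration.

```python
def codes_to_text(codes):
    """Look up each code number in the ASCII table and join the characters."""
    for code in codes:
        if not 0 <= code <= 127:
            raise ValueError("%d is outside the ASCII range 0-127" % code)
    return "".join(chr(code) for code in codes)


print(codes_to_text([72, 101, 108, 108, 111]))   # Hello
print(ord("H"))                                  # 72
```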
Exercise: Decode the following message
010000110110111101101110011001110111001001100001011101000111010101101100011000010111010001
101001011011110110111001110011 00100001
Hint: The first step on your way to decoding the message (separated into bytes for you)
01000011 01101111 01101110 01100111 01110010 01100001 01110100 01110101 01101100 01100001 01110100 01101001 01101111 01101110 01110011 00100001
PS. Please note that this is the information as I've come to understand it. As such, it's somewhat easier to understand, but it may not necessarily be accurate. In other words, if another source contradicts what has been indicated here, that source is probably right. This text was completely written up by me, with no other sources for aid. If you wish to distribute this text, feel free to do so, but I would appreciate it if you contacted me first.
What Is The Registry - Windows Operating Systems
What is the Registry?
The Registry is a database used to store settings and options for the 32-bit versions of Microsoft Windows, including Windows 95, 98, ME and NT/2000. It contains information and settings for all the hardware, software, users, and preferences of the PC. Whenever a user makes changes to Control Panel settings, File Associations, System Policies, or installed software, the changes are reflected and stored in the Registry.
The physical files that make up the registry are stored differently depending on your version of Windows. Under Windows 95 and 98 the registry is contained in two hidden files in your Windows directory, called USER.DAT and SYSTEM.DAT; Windows Me adds a CLASSES.DAT file; and under Windows NT/2000 the files are stored separately in the %SystemRoot%\System32\Config directory. You cannot edit these files directly; you must use a tool commonly known as a "Registry Editor" to make any changes (using registry editors will be discussed later in the article).
The Structure of The Registry
The Registry has a hierarchical structure. Although it looks complicated, the structure is similar to the directory structure on your hard disk, with Regedit being similar to Windows Explorer.
Each main branch (denoted by a folder icon in the Registry Editor) is called a Hive, and Hives contain Keys. Each key can contain other keys (sometimes referred to as sub-keys), as well as Values. The values contain the actual information stored in the Registry. There are three types of values: String, Binary, and DWORD. The use of these depends upon the context.
There are six main branches, each containing a specific portion of the information stored in the Registry. They are as follows:
* HKEY_CLASSES_ROOT - This branch contains all of your file association mappings to support the drag-and-drop feature, OLE information, Windows shortcuts, and core aspects of the Windows user interface.
* HKEY_CURRENT_USER - This branch links to the section of HKEY_USERS appropriate for the user currently logged onto the PC and contains information such as logon names, desktop settings, and Start menu settings.
* HKEY_LOCAL_MACHINE - This branch contains computer-specific information about the type of hardware, software, and other preferences on a given PC. This information is used for all users who log onto this computer.
* HKEY_USERS - This branch contains individual preferences for each user of the computer. Each user is represented by a SID sub-key located under the main branch.
* HKEY_CURRENT_CONFIG - This branch links to the section of HKEY_LOCAL_MACHINE appropriate for the current hardware configuration.
* HKEY_DYN_DATA - This branch points to the part of HKEY_LOCAL_MACHINE used by the Plug-and-Play features of Windows. This section is dynamic and will change as devices are added to and removed from the system.
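To make the folder analogy concrete, here is a toy Python model of the branch / key / value hierarchy. It is only a sketch: the nested-dictionary layout and the `open_key` helper are invented for illustration and are not how Windows stores the Registry on disk. The sample values are borrowed from the REG file example later in this article.

```python
# Toy in-memory model: every key holds sub-keys ("keys") and named values ("values").
registry = {
    "HKEY_LOCAL_MACHINE": {
        "keys": {
            "SYSTEM": {
                "keys": {
                    "Setup": {
                        "keys": {},
                        "values": {"SetupType": 0, "CmdLine": "setup -newsetup"},
                    },
                },
                "values": {},
            },
        },
        "values": {},
    },
}


def open_key(tree, path):
    """Walk a backslash-separated path, like descending through folders."""
    branch, *sub_keys = path.split("\\")
    node = tree[branch]  # the main branch, e.g. HKEY_LOCAL_MACHINE
    for name in sub_keys:
        node = node["keys"][name]  # step down one sub-key at a time
    return node


setup = open_key(registry, r"HKEY_LOCAL_MACHINE\SYSTEM\Setup")
print(setup["values"]["CmdLine"])
```

The path string plays the same role as a directory path on disk: the first component names the main branch and each further component names a sub-key.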
Each registry value is stored as one of five main data types:
* REG_BINARY - This type stores the value as raw binary data. Most hardware component information is stored as binary data, and can be displayed in an editor in hexadecimal format.
* REG_DWORD - This type represents the data as a four-byte number and is commonly used for boolean values, where "0" means disabled and "1" means enabled. Additionally, many parameters for device drivers and services are of this type. It can be displayed in REGEDT32 in binary, hexadecimal and decimal format, or in REGEDIT in hexadecimal and decimal format.
* REG_EXPAND_SZ - This type is an expandable data string, that is, a string containing a variable to be replaced when called by an application. For example, the string "%SystemRoot%" will be replaced by the actual location of the directory containing the Windows NT system files. (This type is only available using an advanced registry editor such as REGEDT32)
* REG_MULTI_SZ - This type is a multiple string used to represent values that contain lists or multiple values. Each entry is separated by a NULL character. (This type is only available using an advanced registry editor such as REGEDT32)
* REG_SZ - This type is a standard string, used to represent human readable text values.
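The two extended string types above are easier to picture with a small example. The Python sketch below imitates what an application does with them; the function names, the `variables` dictionary (a stand-in for the real environment) and the sample data are all assumptions made for illustration, and the matching is case-sensitive for simplicity.

```python
def expand_variables(value: str, variables: dict) -> str:
    """Replace each %NAME% in a REG_EXPAND_SZ-style string with its entry in `variables`."""
    for name, replacement in variables.items():
        value = value.replace(f"%{name}%", replacement)
    return value


def split_multi_string(raw: str) -> list:
    """Split a REG_MULTI_SZ-style string on NULL characters, ignoring empty entries."""
    return [entry for entry in raw.split("\0") if entry]


# Hypothetical sample data, not read from a real registry.
print(expand_variables(r"%SystemRoot%\System32\Config", {"SystemRoot": r"C:\WINNT"}))
print(split_multi_string("first\0second\0third\0"))
```

The stored value keeps the variable name, so the same setting still works if the Windows directory is somewhere else on another machine.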
Other data types not available through the standard registry editors include:
* REG_DWORD_LITTLE_ENDIAN - A 32-bit number in little-endian format.
* REG_DWORD_BIG_ENDIAN - A 32-bit number in big-endian format.
* REG_LINK - A Unicode symbolic link. Used internally; applications should not use this type.
* REG_NONE - No defined value type.
* REG_QWORD - A 64-bit number.
* REG_QWORD_LITTLE_ENDIAN - A 64-bit number in little-endian format.
* REG_RESOURCE_LIST - A device-driver resource list.
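The difference between the little-endian and big-endian DWORD types is only the order in which the four bytes are stored. A short Python sketch using the standard `struct` module shows this; the value 1 is just an example.

```python
import struct

value = 1  # e.g. an "enabled" flag stored as a DWORD

little = struct.pack("<I", value)  # little-endian: least significant byte first
big = struct.pack(">I", value)     # big-endian: most significant byte first

print(little.hex())  # 01000000
print(big.hex())     # 00000001

# Reading the bytes back with the matching byte order recovers the number.
assert struct.unpack("<I", little)[0] == value
assert struct.unpack(">I", big)[0] == value
```

A QWORD works the same way with eight bytes instead of four (format code `Q` instead of `I`).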
Editing The Registry
The Registry Editor (REGEDIT.EXE) is included with most versions of Windows (although you won't find it on the Start Menu). It enables you to view, search and edit the data within the Registry. There are several methods for starting the Registry Editor; the simplest is to click the Start button, select Run, and type "regedit" in the Open box. If the Registry Editor is installed, it should now open.
An alternative Registry Editor (REGEDT32.EXE) is available for use with Windows NT/2000. It includes some additional features not found in the standard version, including the ability to view and modify security permissions and to create and modify the extended string values REG_EXPAND_SZ and REG_MULTI_SZ.
Create a Shortcut to Regedit
To do this, right-click a blank area of your desktop, select New, then Shortcut. In the Command line box enter "regedit.exe" and click Next, enter a friendly name (e.g. 'Registry Editor'), then click Finish. You can now double-click the new icon to launch the Registry Editor.
Using Regedit to modify your Registry
Once you have started Regedit you will notice that on the left side there is a tree with folders, and on the right the contents (values) of the currently selected folder.
As in Windows Explorer, to expand a branch (see the Structure of The Registry section), click the plus sign [+] to the left of any folder, or just double-click the folder. To display the contents of a key (folder), click the desired key and look at the values listed on the right side. You can add a new key or value by selecting New from the Edit menu, or by right-clicking. You can rename any value and almost any key with the same methods used to rename files: right-click an object and click Rename, click it twice (slowly), or press F2 on the keyboard. Lastly, you can delete a key or value by clicking it and pressing Delete on the keyboard, or by right-clicking it and choosing Delete.
Note: it is always a good idea to back up your registry before making any changes to it. The Registry can be intimidating to a new user, and there is always the possibility of changing or deleting a critical setting, forcing you to reinstall the whole operating system. It's much better to be safe than sorry!
Importing and Exporting Registry Settings
A great feature of the Registry Editor is its ability to import and export registry settings to a text file. This text file, identified by the .REG extension, can then be saved or shared with other people to easily modify local registry settings. You can see the layout of these text files by exporting a key to a file and opening it in Notepad: in the Registry Editor select a key, then from the "Registry" menu choose "Export Registry File...", choose a filename and save. If you open this file in Notepad you will see a file similar to the example below:
Quote:
REGEDIT4
[HKEY_LOCAL_MACHINE\SYSTEM\Setup]
"SetupType"=dword:00000000
"CmdLine"="setup -newsetup"
"SystemPrefix"=hex:c5,0b,00,00,00,40,36,02
The layout is quite simple: REGEDIT4 indicates the file type and version, [HKEY_LOCAL_MACHINE\SYSTEM\Setup] indicates the key the values are from, and "SetupType"=dword:00000000 and the lines after it are the values themselves. The portion after the "=" varies depending on the type of value: DWORD, String or Binary.
So after editing this file to make the changes you want, you can easily distribute it; all that needs to be done is to double-click it, or choose "Import" from the Registry menu, for the settings to be added to the system Registry.
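Because the layout is so regular, a REG file like the one quoted above can be read with very little code. The Python sketch below is a simplified illustration, not a full implementation of the format: `parse_reg_text` and `parse_data` are made-up names, and it only handles the three value forms shown in the example (dword, quoted string, hex), ignoring escaped characters and anything else a real export might contain.

```python
def parse_data(data):
    """Convert the text after the "=" into a Python value."""
    if data.startswith("dword:"):
        return int(data[len("dword:"):], 16)  # DWORD: hexadecimal number
    if data.startswith("hex:"):
        # Binary: comma-separated hexadecimal bytes
        return bytes(int(part, 16) for part in data[len("hex:"):].split(","))
    return data.strip('"')  # String: quoted text


def parse_reg_text(text):
    """Parse REGEDIT4-style text into {key_path: {value_name: value}}."""
    lines = [line.strip() for line in text.splitlines() if line.strip()]
    if not lines or lines[0] != "REGEDIT4":
        raise ValueError("expected a REGEDIT4 header")
    keys, current = {}, None
    for line in lines[1:]:
        if line.startswith("[") and line.endswith("]"):
            current = keys.setdefault(line[1:-1], {})  # start of a new key
        elif current is not None and "=" in line:
            name, data = line.split("=", 1)
            current[name.strip('"')] = parse_data(data)
    return keys


sample = r'''REGEDIT4
[HKEY_LOCAL_MACHINE\SYSTEM\Setup]
"SetupType"=dword:00000000
"CmdLine"="setup -newsetup"
"SystemPrefix"=hex:c5,0b,00,00,00,40,36,02
'''

parsed = parse_reg_text(sample)
print(parsed[r"HKEY_LOCAL_MACHINE\SYSTEM\Setup"])
```

Note how the prefix after the "=" (dword:, hex:, or a plain quoted string) is what tells the parser which of the three value types it is looking at.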
Deleting keys or values using a REG file
It is also possible to delete keys and values using REG files. To delete a key, start by using the same format as the REG file above, but place a "-" symbol in front of the key name you want to delete. For example, to delete the [HKEY_LOCAL_MACHINE\SYSTEM\Setup] key the REG file would look like this:
Quote:
REGEDIT4
[-HKEY_LOCAL_MACHINE\SYSTEM\Setup]
The format used to delete individual values is similar, but instead of a minus sign in front of the whole key, place it after the equal sign of the value. For example, to delete the value "SetupType" the file would look like:
Quote:
REGEDIT4
[HKEY_LOCAL_MACHINE\SYSTEM\Setup]
"SetupType"=-
Use this feature with care, as deleting the wrong key or value could cause major problems within the registry, so remember to always make a backup first.
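If you need to produce several of these deletion files, the two patterns can be generated rather than typed by hand. The following Python sketch simply mirrors the layout of the two examples above; `delete_key_reg` and `delete_value_reg` are hypothetical helper names, and the code only builds the text, it does not touch the Registry.

```python
def delete_key_reg(key_path):
    """REG text that deletes an entire key: minus sign in front of the key name."""
    return f"REGEDIT4\n[-{key_path}]\n"


def delete_value_reg(key_path, value_name):
    """REG text that deletes a single value: minus sign after the equal sign."""
    return f'REGEDIT4\n[{key_path}]\n"{value_name}"=-\n'


print(delete_key_reg(r"HKEY_LOCAL_MACHINE\SYSTEM\Setup"))
print(delete_value_reg(r"HKEY_LOCAL_MACHINE\SYSTEM\Setup", "SetupType"))
```

Keeping the generated text in a variable first gives you a chance to review it before saving it with a .REG extension and importing it.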
Regedit Command Line Options
Regedit has a number of command-line options to help automate its use in either batch files or from the command prompt. Some of the options are listed below; please note that some of the functions are operating-system specific.
* regedit.exe [options] [filename] [regpath]
* [filename] Import .reg file into the registry
* /s [filename] Silent import, i.e. hide confirmation box when importing files
* /e [filename] [regpath] Export the registry to [filename] starting at [regpath]
e.g. regedit /e file.reg HKEY_USERS\.DEFAULT
* /L:system Specify the location of the system.dat to use
* /R:user Specify the location of the user.dat to use
* /C [filename] Compress (Windows 98)
* /D [regpath] Delete the specified key (Windows 98)
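As a rough illustration of how these options combine, the Python sketch below assembles the argument lists for a silent import and for an export, using only the /s and /e options listed above. The helper names are invented for this example, and the code only builds the command; on a Windows machine the resulting list could be handed to something like `subprocess.run`, which is not done here.

```python
def import_command(reg_file, silent=False):
    """Arguments for importing a .reg file; /s hides the confirmation box."""
    return ["regedit.exe"] + (["/s"] if silent else []) + [reg_file]


def export_command(reg_file, reg_path):
    """Arguments for exporting the branch at reg_path into reg_file."""
    return ["regedit.exe", "/e", reg_file, reg_path]


print(" ".join(import_command("settings.reg", silent=True)))
print(" ".join(export_command("file.reg", r"HKEY_USERS\.DEFAULT")))
```

The file name settings.reg is a placeholder; the export line reproduces the example given in the option list.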
Maintaining the Registry
How can you back up and restore the Registry?
Windows 95
Microsoft included a utility on the Windows 95 CD-ROM that lets you create backups of the Registry on your computer. The Microsoft Configuration Backup program, CFGBACK.EXE, can be found in the \Other\Misc\Cfgback directory on the Windows 95 CD-ROM. This utility lets you create up to nine different backup copies of the Registry, which it stores, with the extension RBK, in your \Windows directory. If your system is set up for multiple users, CFGBACK.EXE won't back up the USER.DAT file.
After you have backed up your Registry, you can copy the RBK file onto a floppy disk for safekeeping. However, to restore from a backup, the RBK file must reside in the \Windows directory. Windows 95 stores the backups in compressed form, which you can then restore only by using the CFGBACK.EXE utility.
Windows 98
Microsoft Windows 98 automatically creates a backup copy of the registry every time Windows starts. In addition, you can manually create a backup using the Registry Checker utility by running SCANREGW.EXE from the Start | Run menu.
What to do if you get a Corrupted Registry
Windows 95, 98 and NT all have a simple registry backup mechanism that is quite reliable. However, you should never rely on it alone; remember to always make a backup first!
Windows 95
In the Windows directory there are several hidden files. Four of these are SYSTEM.DAT and USER.DAT, your current registry, and SYSTEM.DA0 and USER.DA0, a backup of your registry. Windows 9x has a nice feature: every time it appears to start successfully, it copies the registry over these backup files, so if something goes wrong you can restore it to a known good state. To restore the registry, follow these instructions:
* Click the Start button, and then click Shut Down.
* Click Restart The Computer In MS-DOS Mode, then click Yes.
* Change to your Windows directory. For example, if your Windows directory is c:\windows, you would type the following:
cd c:\windows
* Type the following commands, pressing ENTER after each one. (Note that SYSTEM.DA0 and USER.DA0 contain the number zero.)
attrib -h -r -s system.dat
attrib -h -r -s system.da0
copy system.da0 system.dat
attrib -h -r -s user.dat
attrib -h -r -s user.da0
copy user.da0 user.dat
* Restart your computer.
Following this procedure will restore your registry to its state when you last successfully started your computer.
If all else fails, there is a file on your hard disk named SYSTEM.1ST that was created when Windows 95 was first successfully installed. If necessary, you can change this file's attributes from read-only and hidden to archive and then copy it to C:\WINDOWS\SYSTEM.DAT.
Windows NT
On Windows NT you can use either the "Last Known Good" option or RDISK to restore the registry to a stable working configuration.
How can I clean out old data from the Registry?
Although it's possible to go through the Registry manually and delete unwanted entries, Microsoft provides a tool to automate the process, called RegClean. RegClean analyzes Windows Registry keys stored in a common location in the Windows Registry. It finds keys that contain erroneous values and removes them from the Windows Registry after recording those entries in the Undo.Reg file.