TUN/TAP interface (on Linux)
Some notes about using the TUN/TAP interface, especially on Linux.
Introduction
TUN/TAP is an operating-system interface for creating network interfaces managed by userspace. It is typically used to implement userspace Virtual Private Networks[1] (VPNs), for example with OpenVPN, OpenSSH (Tunnel configuration or -w argument), l2tpns, etc. This interface is exposed through the /dev/net/tun device file and related ioctls (TUNSETIFF, etc.).
Basic Usage
This C code snippet creates a virtual network interface tun0
controlled by the program:
int fd = open("/dev/net/tun", O_RDWR);
struct ifreq ifr;
memset(&ifr, 0, sizeof(ifr));
ifr.ifr_flags = IFF_TUN | IFF_NO_PI;
strncpy(ifr.ifr_name, "tun0", IFNAMSIZ);
ioctl(fd, TUNSETIFF, &ifr);
Note: I've omitted error handling for brevity in the code snippets. In real programs, you should check the result of each operation.
The program first opens the /dev/net/tun file: each file descriptor obtained this way can be used by the process to communicate with one virtual network device.
Then, we use the TUNSETIFF ioctl() to configure the network interface associated with this file descriptor:
- we can use the ifr_flags field to choose whether to create a TUN (i.e. IP) or a TAP (i.e. Ethernet) interface (explained in the next section);
- we can use the ifr_flags field to set additional options as well (such as the IFF_NO_PI flag, explained afterwards);
- if the ifr_name field is not empty, the call either creates a new virtual network interface with the chosen name or attaches to an existing one (depending on whether a network interface with this name already exists);
- if the ifr_name field is empty, the system creates a new interface, generates a name for it and stores this name in the ifr.ifr_name field.
After this, the program can communicate with the virtual network interface using the file descriptor:
- write() sends a (single) packet or frame to the virtual network interface;
- read() receives a (single) packet or frame from the virtual network interface;
- select() works as expected.
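As an illustration of readiness notification on this file descriptor, here is a minimal sketch which waits for one packet with a timeout. Note that it uses poll() rather than select() (both rely on the same readiness mechanism), and that the helper name read_packet_timeout is hypothetical.

```c
#include <poll.h>
#include <unistd.h>
#include <sys/types.h>

// Hypothetical helper: wait up to `timeout_ms` milliseconds for one packet
// (or frame) on `fd`, then read it.
// Returns the packet size, 0 on timeout, or -1 on error.
ssize_t read_packet_timeout(int fd, void* buffer, size_t size, int timeout_ms)
{
    struct pollfd pfd = { .fd = fd, .events = POLLIN };
    int ready = poll(&pfd, 1, timeout_ms);
    if (ready <= 0)
        return ready; // 0 = timeout, -1 = error
    // Each read() returns a single packet or frame:
    return read(fd, buffer, size);
}
```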
When the program closes the last file descriptor associated with the TUN/TAP interface, the system destroys the interface. We can prevent this by making the interface persistent with the TUNSETPERSIST ioctl().
TUN vs. TAP
There are two types of virtual network interfaces managed by /dev/net/tun:
- TUN interfaces transport IP packets (layer 3);
- TAP interfaces transport Ethernet frames (layer 2).
TUN interfaces (L3)
TUN interfaces (IFF_TUN) transport layer 3 (L3) Protocol Data Units (PDUs):
- in practice, they transport IPv4 and/or IPv6 packets;
- read() gets an L3 PDU (an IP packet);
- you must write() L3 PDUs (IP packets);
- there is no layer 2 (Ethernet, etc.) involved in the interface;
- they are POINTOPOINT interfaces[2].
If we have set IFF_NO_PI in the ifr_flags field, the version of the IP protocol (IPv4 or IPv6) is deduced from the IP version number in the packet. If IFF_NO_PI is not set, 4 bytes (struct tun_pi) are prepended to each packet (see below for details).
TAP interfaces (L2)
TAP interfaces (IFF_TAP) transport layer 2 (L2) PDUs:
- in practice, they transport Ethernet frames (i.e. this is a virtual Ethernet adapter);
- read() gets an L2 PDU;
- you must write() L2 PDUs.
By default, the TAP interface is an Ethernet interface but we can change the type of device with the TUNSETLINK ioctl(). For example, we can request a virtual ATM network interface with:
ioctl(fd, TUNSETLINK, ARPHRD_ATM);
which gives:
5: tap0: mtu 1500 qdisc noop state DOWN group default qlen 500
    link/atm 12:07:d9:23:19:7d brd ff:ff:ff:ff:ff:ff
I am not sure this is really useful, however, as it seems to only really work with Ethernet interfaces.
We can set the hardware address (for Ethernet interfaces, the MAC address) with SIOCSIFHWADDR (see man netdevice):
struct ifreq ifr;
memset(&ifr, 0, sizeof(ifr));
ifr.ifr_hwaddr.sa_family = ARPHRD_ETHER;
// We need a placeholder socket for this:
int sock = socket(PF_INET, SOCK_STREAM, 0);
strncpy(ifr.ifr_name, iface_name, IFNAMSIZ - 1);
memcpy(ifr.ifr_hwaddr.sa_data, hwaddr, hwaddr_size);
ioctl(sock, SIOCSIFHWADDR, &ifr);
close(sock);
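The hardware address passed to SIOCSIFHWADDR is made of raw bytes. When the address comes from a textual form such as 7a:40:61:dc:b9:5b, it has to be converted first. Here is a minimal sketch of such a conversion: the helper name parse_mac is hypothetical and the parsing is deliberately lenient (it relies on sscanf()).

```c
#include <stdio.h>

// Hypothetical helper: parse "aa:bb:cc:dd:ee:ff" into 6 raw bytes.
// Returns 0 on success, -1 if the string does not look like a MAC address.
int parse_mac(const char* text, unsigned char hwaddr[6])
{
    unsigned int b[6];
    char trailing;
    // The final %c only matches if something follows the sixth byte:
    int n = sscanf(text, "%2x:%2x:%2x:%2x:%2x:%2x%c",
                   &b[0], &b[1], &b[2], &b[3], &b[4], &b[5], &trailing);
    if (n != 6)
        return -1;
    for (int i = 0; i < 6; ++i)
        hwaddr[i] = (unsigned char) b[i];
    return 0;
}
```

The resulting 6 bytes can then be used as hwaddr (with hwaddr_size equal to 6) in the snippet above.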
Advanced considerations
Packet Information
If IFF_NO_PI is not used, each packet read from or written to the file descriptor is prefixed with 4 bytes:
struct tun_pi {
    __u16 flags;
    __be16 proto;
};
The proto field is an EtherType. This is not very useful for TAP interfaces as far as I know. For TUN interfaces, this is usually ETH_P_IP or ETH_P_IPV6. If IFF_NO_PI is used, the IP version of packets sent by the process is derived from the first byte of the packet (for TUN interfaces).
Currently, the only flag defined is TUN_PKT_STRIP, which is set by the kernel to signal to the userspace program that the packet was truncated because the buffer was too small.
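To make the layout of this header concrete, here is a minimal sketch which decodes the 4-byte prefix of a buffer read from a file descriptor configured without IFF_NO_PI. The helper name parse_tun_pi is hypothetical. The proto field is converted from network byte order, while flags is copied as-is.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>
#include <arpa/inet.h>       // ntohs
#include <linux/if_ether.h>  // ETH_P_IP, ETH_P_IPV6
#include <linux/if_tun.h>    // struct tun_pi, TUN_PKT_STRIP

// Hypothetical helper: decode the struct tun_pi found at the beginning
// of `buffer`. The payload (IP packet or Ethernet frame) starts at
// buffer + sizeof(struct tun_pi).
// Returns 0 on success, -1 if the buffer is too short.
int parse_tun_pi(const unsigned char* buffer, size_t size,
                 uint16_t* flags, uint16_t* proto)
{
    struct tun_pi pi;
    if (size < sizeof(pi))
        return -1;
    memcpy(&pi, buffer, sizeof(pi)); // avoids unaligned accesses
    *flags = pi.flags;               // may contain TUN_PKT_STRIP
    *proto = ntohs(pi.proto);        // EtherType, e.g. ETH_P_IPV6
    return 0;
}
```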
Persistent
The TUNSETPERSIST ioctl can be used to make the TUN/TAP interface persistent. In this mode, the interface won't be destroyed when the last process closes the associated /dev/net/tun file descriptor.
ioctl(tap_fd, TUNSETPERSIST, 1);
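There is a race condition when setting TUNSETPERSIST: if an interface with the requested name already exists, TUNSETIFF attaches to it instead of creating a new one. The IFF_TUN_EXCL flag can be used to ensure we have a new interface: if this flag is set and the interface already exists, we get an EBUSY error.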
It is possible to create a persistent TUN/TAP interface using the ip tuntap command:
sudo ip tuntap add dev foo0 mode tun
User and group
It is possible to assign a persistent interface to a given user in order to give a non-root user access to a TUN/TAP interface:
ioctl(tap_fd, TUNSETPERSIST, 1);
ioctl(tap_fd, TUNSETOWNER, owner);
Or to a whole group:
ioctl(tap_fd, TUNSETPERSIST, 1);
ioctl(tap_fd, TUNSETGROUP, group);
We can use ip tuntap for this as well:
sudo ip tuntap add dev foo0 mode tun user john
sudo ip tuntap add dev foo1 mode tun group doe
Examples
TUN Reader
This simple program creates a TUN interface. Every packet sent to this interface is printed to the standard error (stderr) after parsing some fields:
Includes
#include <sys/types.h>
#include <sys/stat.h>
#include <fcntl.h>
#include <string.h>
#include <unistd.h>
#include <stdio.h>
#include <stdint.h> // uint16_t
#include <netdb.h>
#include <netinet/in.h> // IPPROTO_*
#include <net/if.h> // ifreq
#include <linux/if_tun.h> // IFF_TUN, IFF_NO_PI
#include <sys/ioctl.h>
Utilities
#define BUFFLEN (4 * 1024)
const char HEX[] = {
    '0', '1', '2', '3', '4', '5', '6', '7', '8', '9',
    'a', 'b', 'c', 'd', 'e', 'f',
};
void hex(char* source, char* dest, ssize_t count)
{
    for (ssize_t i = 0; i < count; ++i) {
        unsigned char data = source[i];
        dest[2 * i] = HEX[data >> 4];
        dest[2 * i + 1] = HEX[data & 15];
    }
    dest[2 * count] = '\0';
}
Dumping packets
int has_ports(int protocol)
{
    switch(protocol) {
    case IPPROTO_UDP:
    case IPPROTO_UDPLITE:
    case IPPROTO_TCP:
        return 1;
    default:
        return 0;
    }
}
void dump_ports(int protocol, int count, const char* buffer)
{
    if (!has_ports(protocol))
        return;
    if (count < 4)
        return;
    uint16_t source_port;
    uint16_t dest_port;
    // Ports are stored in network byte order (big-endian):
    memcpy(&source_port, buffer, 2);
    source_port = ntohs(source_port);
    memcpy(&dest_port, buffer + 2, 2);
    dest_port = ntohs(dest_port);
    fprintf(stderr, " sport=%u, dport=%u\n", (unsigned) source_port, (unsigned) dest_port);
}
void dump_packet_ipv4(int count, char* buffer)
{
    if (count < 20) {
        fprintf(stderr, "IPv4 packet too short\n");
        return;
    }
    // The header length (IHL) is expressed in 32-bit words:
    int header_length = (buffer[0] & 0x0f) * 4;
    if (header_length < 20 || count < header_length) {
        fprintf(stderr, "Invalid IPv4 header length\n");
        return;
    }
    char buffer2[2*BUFFLEN + 1];
    hex(buffer, buffer2, count);
    int protocol = (unsigned char) buffer[9];
    struct protoent* protocol_entry = getprotobynumber(protocol);
    unsigned ttl = (unsigned char) buffer[8];
    fprintf(stderr, "IPv4: src=%u.%u.%u.%u dst=%u.%u.%u.%u proto=%u(%s) ttl=%u\n",
        (unsigned char) buffer[12], (unsigned char) buffer[13], (unsigned char) buffer[14], (unsigned char) buffer[15],
        (unsigned char) buffer[16], (unsigned char) buffer[17], (unsigned char) buffer[18], (unsigned char) buffer[19],
        (unsigned) protocol,
        protocol_entry == NULL ? "?" : protocol_entry->p_name, ttl
        );
    // The L4 header starts after the IPv4 header (including its options):
    dump_ports(protocol, count - header_length, buffer + header_length);
    fprintf(stderr, " HEX: %s\n", buffer2);
}
void dump_packet_ipv6(int count, char* buffer)
{
    if (count < 40) {
        fprintf(stderr, "IPv6 packet too short\n");
        return;
    }
    char buffer2[2*BUFFLEN + 1];
    hex(buffer, buffer2, count);
    int protocol = (unsigned char) buffer[6];
    struct protoent* protocol_entry = getprotobynumber(protocol);
    char source_address[33];
    char destination_address[33];
    hex(buffer + 8, source_address, 16);
    hex(buffer + 24, destination_address, 16);
    int hop_limit = (unsigned char) buffer[7];
    fprintf(stderr, "IPv6: src=%s dst=%s proto=%u(%s) hop_limit=%i\n",
        source_address, destination_address,
        (unsigned) protocol,
        protocol_entry == NULL ? "?" : protocol_entry->p_name,
        hop_limit);
    dump_ports(protocol, count - 40, buffer + 40);
    fprintf(stderr, " HEX: %s\n", buffer2);
}
void dump_packet(int count, char* buffer)
{
    unsigned char version = ((unsigned char) buffer[0]) >> 4;
    if (version == 4) {
        dump_packet_ipv4(count, buffer);
    } else if (version == 6) {
        dump_packet_ipv6(count, buffer);
    } else {
        fprintf(stderr, "Unknown packet version\n");
    }
}
int main(int argc, char** argv)
{
    if (argc != 2)
        return 1;
    const char* device_name = argv[1];
    if (strlen(device_name) + 1 > IFNAMSIZ)
        return 1;
    // Request a TUN device:
    int fd = open("/dev/net/tun", O_RDWR);
    if (fd == -1)
        return 1;
    struct ifreq ifr;
    memset(&ifr, 0, sizeof(ifr));
    ifr.ifr_flags = IFF_TUN | IFF_NO_PI;
    strncpy(ifr.ifr_name, device_name, IFNAMSIZ);
    int res = ioctl(fd, TUNSETIFF, &ifr);
    if (res == -1)
        return 1;
    char buffer[BUFFLEN];
    while (1) {
        // Read an IP packet:
        ssize_t count = read(fd, buffer, BUFFLEN);
        if (count < 0)
            return 1;
        dump_packet(count, buffer);
    }
    return 0;
}
Notice how read() on the file descriptor is used by the userspace program to get a packet sent by the network stack to the network interface. If we wanted packets to come out of the interface, we would have to write() them to the file descriptor.
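As an illustration of the write() direction, here is a minimal sketch which turns an ICMP echo request (as read from a TUN file descriptor) into the corresponding echo reply, in place. Writing the modified buffer back to the file descriptor would make the reply appear as coming from the virtual interface. The helper names inet_checksum and make_echo_reply are hypothetical. The sketch assumes that the buffer contains exactly one unfragmented IPv4 packet.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>
#include <netinet/in.h> // IPPROTO_ICMP

// Internet checksum (one's complement sum of 16-bit big-endian words).
uint16_t inet_checksum(const unsigned char* data, size_t len)
{
    uint32_t sum = 0;
    for (size_t i = 0; i + 1 < len; i += 2)
        sum += ((uint32_t) data[i] << 8) | data[i + 1];
    if (len % 2)
        sum += (uint32_t) data[len - 1] << 8;
    while (sum >> 16)
        sum = (sum & 0xffff) + (sum >> 16);
    return (uint16_t) ~sum;
}

// Hypothetical helper: transform an ICMP echo request into an echo reply.
// Returns 0 on success, -1 if the packet is not an IPv4 echo request.
int make_echo_reply(unsigned char* packet, size_t len)
{
    if (len < 20 || (packet[0] >> 4) != 4)
        return -1;
    size_t ihl = (size_t) (packet[0] & 0x0f) * 4;
    if (ihl < 20 || len < ihl + 8)
        return -1;
    if (packet[9] != IPPROTO_ICMP || packet[ihl] != 8)
        return -1;
    // Swap source and destination addresses. This does not change the
    // IPv4 header checksum (the sum is commutative).
    unsigned char tmp[4];
    memcpy(tmp, packet + 12, 4);
    memcpy(packet + 12, packet + 16, 4);
    memcpy(packet + 16, tmp, 4);
    // Type 0 is "echo reply"; recompute the ICMP checksum:
    packet[ihl] = 0;
    packet[ihl + 2] = 0;
    packet[ihl + 3] = 0;
    uint16_t checksum = inet_checksum(packet + ihl, len - ihl);
    packet[ihl + 2] = (unsigned char) (checksum >> 8);
    packet[ihl + 3] = (unsigned char) (checksum & 0xff);
    return 0;
}
```

In the read loop of the program above, this could be used as: if (make_echo_reply((unsigned char*) buffer, count) == 0) write(fd, buffer, count);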
The program needs to be run as root:
sudo ./tun_reader tun0
The interface is initially down and without an IPv4 address:
8: tun0: <POINTOPOINT,MULTICAST,NOARP> mtu 1500 qdisc noop state DOWN group default qlen 500
link/none
Now we need to configure an IP address and bring the interface up before we can use it:
sudo ip addr add 203.0.113.1/24 dev tun0 &&
sudo ip link set up dev tun0
8: tun0: mtu 1500 qdisc pfifo_fast state UNKNOWN group default qlen 500
    link/none
    inet 203.0.113.1/24 scope global tun0
       valid_lft forever preferred_lft forever
    inet6 fe80::807f:6e64:ba6e:ee26/64 scope link stable-privacy
       valid_lft forever preferred_lft forever
Alternatively, we could have configured the interface from the program:
- configure the IP address(es) of the interface using RTM_NEWADDR;
- set the interface up with the SIOCSIFFLAGS ioctl with IFF_UP.
We get this kind of output:
IPv6: src=fe80000000000000807f6e64ba6eee26 dst=ff020000000000000000000000000002 proto=58(ipv6-icmp) hop_limit=255
 HEX: 6000000000083afffe80000000000000807f6e64ba6eee26ff0200000000000000000000000000028500e5bd00000000
IPv6: src=fe80000000000000807f6e64ba6eee26 dst=ff020000000000000000000000000016 proto=0(ip) hop_limit=1
 HEX: 6000000000240001fe80000000000000807f6e64ba6eee26ff0200000000000000000000000000163a000502000001008f00d88d0000000104000000ff020000000000000000000000010003
IPv4: src=203.0.113.1 dst=224.0.0.22 proto=2(igmp) ttl=1
 HEX: 46c00028000040000102c7f7cb007101e0000016940400002200f9010000000104000000e00000fc
IPv4: src=203.0.113.1 dst=224.0.0.252 proto=17(udp) ttl=255
 sport=5355, dport=5355
 HEX: 45000034e8e40000ff11b5d5cb007101e00000fc14eb14eb00205ec50cca00000001000000000000066d617276696e0000ff0001
IPv6: src=fe80000000000000807f6e64ba6eee26 dst=ff020000000000000000000000010003 proto=17(udp) hop_limit=255
 sport=5355, dport=5355
 HEX: 600516ee001f11fffe80000000000000807f6e64ba6eee26ff02000000000000000000000001000314eb14eb001f75abedf000000001000000000000056c6f63616c0000060001
IPv6: src=fe80000000000000807f6e64ba6eee26 dst=ff020000000000000000000000010003 proto=17(udp) hop_limit=255
 sport=5355, dport=5355
 HEX: 600516ee002011fffe80000000000000807f6e64ba6eee26ff02000000000000000000000001000314eb14eb00204b0fa87d00000001000000000000066d617276696e0000ff0001
IPv4: src=203.0.113.1 dst=224.0.0.252 proto=17(udp) ttl=255
 sport=5355, dport=5355
 HEX: 45000033e8e50000ff11b5d5cb007101e00000fc14eb14eb001f1237c96700000001000000000000056c6f63616c0000060001
We can easily recognize from the hex dump that these are IP packets: they start with the IP version (0x6 or 0x4).
Dissecting with ScaPy
For more details, we can dissect IP packets from Python using ScaPy:
from scapy.all import *
IP(bytes.fromhex("45000033e8e50000ff11b5d5cb007101e00000fc14eb14eb001f1237c96700000001000000000000056c6f63616c0000060001")).display()
###[ IP ]###
  version = 4
  ihl = 5
  tos = 0x0
  len = 51
  id = 59621
  flags =
  frag = 0
  ttl = 255
  proto = udp
  chksum = 0xb5d5
  src = 203.0.113.1
  dst = 224.0.0.252
  \options \
###[ UDP ]###
  sport = hostmon
  dport = hostmon
  len = 31
  chksum = 0x1237
###[ Link Local Multicast Node Resolution - Query ]###
  id = 51559
  qr = 0
  opcode = QUERY
  c = 0
  tc = 0
  z = 0
  rcode = ok
  qdcount = 1
  ancount = 0
  nscount = 0
  arcount = 0
  \qd \
   |###[ DNS Question Record ]###
   |  qname = 'local.'
   |  qtype = SOA
   |  qclass = IN
  an = None
  ns = None
  ar = None
Now we can try to ping an address on the link:
ping 203.0.113.2
Which gives the packets:
IPv4: src=203.0.113.1 dst=203.0.113.2 proto=icmp ttl=64
 HEX: 45000054ceab40004001f3f8cb007101cb00710208008d0b23420001f6ab9e5e00000000edd3060000000000101112131415161718191a1b1c1d1e1f202122232425262728292a2b2c2d2e2f3031323334353637
IPv4: src=203.0.113.1 dst=203.0.113.2 proto=icmp ttl=64
 HEX: 45000054cf8340004001f320cb007101cb00710208007f9823420002f7ab9e5e00000000f945070000000000101112131415161718191a1b1c1d1e1f202122232425262728292a2b2c2d2e2f3031323334353637
Dissecting with ScaPy
The first packet is dissected by ScaPy as:
###[ IP ]###
  version = 4
  ihl = 5
  tos = 0x0
  len = 84
  id = 52907
  flags = DF
  frag = 0
  ttl = 64
  proto = icmp
  chksum = 0xf3f8
  src = 203.0.113.1
  dst = 203.0.113.2
  \options \
###[ ICMP ]###
  type = echo-request
  code = 0
  chksum = 0x8d0b
  id = 0x2342
  seq = 0x1
###[ Raw ]###
  load = '\xf6\xab\x9e^\x00\x00\x00\x00\xed\xd3\x06\x00\x00\x00\x00\x00\x10\x11\x12\x13\x14\x15\x16\x17\x18\x19\x1a\x1b\x1c\x1d\x1e\x1f !"#$%&\'()*+,-./01234567'
TAP Reader
Here is a simple program which creates a TAP interface and dumps the Ethernet frames it receives from the kernel in hexadecimal form:
Includes
#include <sys/types.h>
#include <sys/stat.h>
#include <fcntl.h>
#include <string.h>
#include <unistd.h>
#include <stdio.h>
#include <netdb.h>
#include <net/if.h> // ifreq
#include <linux/if_tun.h> // IFF_TAP, IFF_NO_PI
#include <linux/if_arp.h>
#include <sys/ioctl.h>
Utilities
#define BUFFLEN (4 * 1024)
const char HEX[] = {
    '0', '1', '2', '3', '4', '5', '6', '7', '8', '9',
    'a', 'b', 'c', 'd', 'e', 'f',
};
void hex(char* source, char* dest, ssize_t count)
{
    for (ssize_t i = 0; i < count; ++i) {
        unsigned char data = source[i];
        dest[2 * i] = HEX[data >> 4];
        dest[2 * i + 1] = HEX[data & 15];
    }
    dest[2 * count] = '\0';
}
int main(int argc, char** argv)
{
    if (argc != 2)
        return 1;
    const char* device_name = argv[1];
    if (strlen(device_name) + 1 > IFNAMSIZ)
        return 1;
    // Request a TAP device:
    int fd = open("/dev/net/tun", O_RDWR);
    if (fd == -1)
        return 1;
    struct ifreq ifr;
    memset(&ifr, 0, sizeof(ifr));
    ifr.ifr_flags = IFF_TAP | IFF_NO_PI;
    strncpy(ifr.ifr_name, device_name, IFNAMSIZ);
    int res = ioctl(fd, TUNSETIFF, &ifr);
    if (res == -1)
        return 1;
    char buffer[BUFFLEN];
    char buffer2[2*BUFFLEN + 1];
    while (1) {
        // Read a frame:
        ssize_t count = read(fd, buffer, BUFFLEN);
        if (count < 0)
            return 1;
        // Dump frame:
        hex(buffer, buffer2, count);
        fprintf(stderr, "%s\n", buffer2);
    }
    return 0;
}
Let's run it:
sudo ./tap_reader tap0
We now have a TAP interface:
9: tap0: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN group default qlen 1000
link/ether 7a:40:61:dc:b9:5b brd ff:ff:ff:ff:ff:ff
We can notice several differences compared to the TUN example:
- this interface has a BROADCAST flag instead of POINTOPOINT;
- we don't have a NOARP interface anymore (we need to find L2 addresses when talking to interfaces on this network);
- we now have a link/ether line with an associated MAC address.
Let's bring the interface up:
sudo ip addr add 203.0.113.1/24 dev tap0
sudo ip link set up dev tap0
9: tap0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UNKNOWN group default qlen 1000
link/ether 7a:40:61:dc:b9:5b brd ff:ff:ff:ff:ff:ff
inet 203.0.113.1/24 scope global tap0
valid_lft forever preferred_lft forever
inet6 fe80::7840:61ff:fedc:b95b/64 scope link tentative
valid_lft forever preferred_lft forever
Output:
01005e0000167a4061dcb95b080046c00030000040000102c7efcb007101e000001694040000220014050000000204000000e00000fc04000000e00000fb
3333000000167a4061dcb95b86dd600000000024000100000000000000000000000000000000ff0200000000000000000000000000163a000502000001008f00b5520000000104000000ff0200000000000000000001ffdcb95b
01005e0000fb7a4061dcb95b08004500004996344000ff11c871cb007101e00000fb14e914e90035b739000000000002000000000000055f69707073045f746370056c6f63616c00000c0001045f697070c012000c0001
01005e0000fb7a4061dcb95b08004500007696354000ff11c843cb007101e00000fb14e914e90062336400000000000200000002000001310331313301300332303307696e2d6164647204617270610000ff0001066d617276696e056c6f63616c0000ff0001c02a00010001000000780004cb007101c00c000c0001000000780002c02a
We now have Ethernet frames instead of raw IP packets coming in and out of the virtual interface:
- we find destination MAC addresses such as:
  - 01:00:5e:00:00:16, multicast MAC address for multicast IP address 224.0.0.22 (IGMP);
  - 01:00:5e:00:00:fb, multicast MAC address for multicast IP address 224.0.0.251 (mDNS);
- we find the source MAC address of the TAP interface (in this example, 7a:40:61:dc:b9:5b).
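The observations above can be checked programmatically. Here is a minimal sketch which extracts the fields of the Ethernet header of a frame read from the TAP file descriptor, and which computes the multicast MAC address associated with a multicast IPv4 address (01:00:5e followed by the low 23 bits of the IP address). The names ether_info, parse_ether_header and ipv4_multicast_mac are hypothetical. VLAN tags are not handled.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

struct ether_info {
    unsigned char dst[6];  // destination MAC address
    unsigned char src[6];  // source MAC address
    uint16_t ethertype;    // in host byte order (0x0800 = IPv4, ...)
};

// Hypothetical helper: decode the 14-byte Ethernet header of `frame`.
// Returns 0 on success, -1 if the frame is too short.
int parse_ether_header(const unsigned char* frame, size_t size,
                       struct ether_info* out)
{
    if (size < 14)
        return -1;
    memcpy(out->dst, frame, 6);
    memcpy(out->src, frame + 6, 6);
    out->ethertype = (uint16_t) ((frame[12] << 8) | frame[13]);
    return 0;
}

// Hypothetical helper: multicast MAC address used for a multicast
// IPv4 address (given as 4 bytes in network order).
void ipv4_multicast_mac(const unsigned char ip[4], unsigned char mac[6])
{
    mac[0] = 0x01;
    mac[1] = 0x00;
    mac[2] = 0x5e;
    mac[3] = ip[1] & 0x7f; // only the low 23 bits of the address are kept
    mac[4] = ip[2];
    mac[5] = ip[3];
}
```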
Dissecting with ScaPy
We can dissect Ethernet frames with ScaPy:
from scapy.all import *
Ether(bytes.fromhex("01005e0000167a4061dcb95b080046c00030000040000102c7efcb007101e000001694040000220014050000000204000000e00000fc04000000e00000fb")).display()
###[ Ethernet ]###
  dst = 01:00:5e:00:00:16
  src = 7a:40:61:dc:b9:5b
  type = IPv4
###[ IP ]###
  version = 4
  ihl = 6
  tos = 0xc0
  len = 48
  id = 0
  flags = DF
  frag = 0
  ttl = 1
  proto = igmp
  chksum = 0xc7ef
  src = 203.0.113.1
  dst = 224.0.0.22
  \options \
   |###[ IP Option Router Alert ]###
   |  copy_flag = 1
   |  optclass = control
   |  option = router_alert
   |  length = 4
   |  alert = router_shall_examine_packet
###[ Raw ]###
  load = '"\x00\x14\x05\x00\x00\x00\x02\x04\x00\x00\x00\xe0\x00\x00\xfc\x04\x00\x00\x00\xe0\x00\x00\xfb'
Now we can try to ping an address on the link:
ping 203.0.113.2
Which gives:
ffffffffffff7a4061dcb95b080600010800060400017a4061dcb95bcb007101000000000000cb007102
ffffffffffff7a4061dcb95b080600010800060400017a4061dcb95bcb007101000000000000cb007102
ffffffffffff7a4061dcb95b080600010800060400017a4061dcb95bcb007101000000000000cb007102
Dissecting with ScaPy
The frames are decoded by ScaPy as:
###[ Ethernet ]###
  dst = ff:ff:ff:ff:ff:ff
  src = 7a:40:61:dc:b9:5b
  type = ARP
###[ ARP ]###
  hwtype = 0x1
  ptype = IPv4
  hwlen = 6
  plen = 4
  op = who-has
  hwsrc = 7a:40:61:dc:b9:5b
  psrc = 203.0.113.1
  hwdst = 00:00:00:00:00:00
  pdst = 203.0.113.2
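On a TAP interface, the ping does not show up as ICMP packets at first: the kernel needs to resolve the MAC address of 203.0.113.2 and, as our program never answers, we only see repeated ARP requests. A program wishing to answer them would first have to decode them. Here is a minimal sketch of this decoding for ARP over Ethernet/IPv4. The names arp_ipv4 and parse_arp_frame are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

struct arp_ipv4 {
    uint16_t op;           // 1 = request (who-has), 2 = reply (is-at)
    unsigned char sha[6];  // sender hardware (MAC) address
    unsigned char spa[4];  // sender protocol (IPv4) address
    unsigned char tha[6];  // target hardware (MAC) address
    unsigned char tpa[4];  // target protocol (IPv4) address
};

// Hypothetical helper: decode an Ethernet frame containing an ARP
// message for IPv4 over Ethernet.
// Returns 0 on success, -1 if the frame is not such a message.
int parse_arp_frame(const unsigned char* frame, size_t size,
                    struct arp_ipv4* out)
{
    if (size < 14 + 28)
        return -1;
    if (frame[12] != 0x08 || frame[13] != 0x06)
        return -1; // EtherType is not ARP (0x0806)
    const unsigned char* arp = frame + 14;
    // Hardware type 1 (Ethernet), protocol type 0x0800 (IPv4),
    // hardware address length 6, protocol address length 4:
    static const unsigned char expected[6] = { 0x00, 0x01, 0x08, 0x00, 0x06, 0x04 };
    if (memcmp(arp, expected, sizeof(expected)) != 0)
        return -1;
    out->op = (uint16_t) ((arp[6] << 8) | arp[7]);
    memcpy(out->sha, arp + 8, 6);
    memcpy(out->spa, arp + 14, 4);
    memcpy(out->tha, arp + 18, 6);
    memcpy(out->tpa, arp + 24, 4);
    return 0;
}
```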
References
- Universal TUN/TAP device driver
- Tun/Tap interface tutorial
- Linux Tun/Tap ioctl code
- MacVTap
- man netdevice
- Virtualbox tunctl.c
- FreeBSD man tun(4)
- OpenBSD man tun(4)
[1] However, not all VPN software is based on TUN/TAP. For example, WireGuard is implemented in the kernel and has its own dedicated network interface type. Many tunnels based on PPP use pppd, which uses pppX network interfaces; these are managed through /dev/ppp and related ioctls.
[2] When using an Ethernet or Wi-Fi interface, several different machines can possibly be reached directly through this interface: as a consequence, L2 addressing is needed to specify which host on the LAN a given frame is sent to. In contrast, only a single machine is directly reachable through POINTOPOINT network interfaces and there is thus no L2 addressing. TUN interfaces only transport L3 (IP) packets: they do not have L2 addressing and are thus POINTOPOINT interfaces.