{"version": "https://jsonfeed.org/version/1", "title": "/dev/posts/ - Tag index - vpn", "home_page_url": "https://www.gabriel.urdhr.fr", "feed_url": "/tags/vpn/feed.json", "items": [{"id": "http://www.gabriel.urdhr.fr/2021/05/08/tuntap/", "title": "TUN/TAP interface (on Linux)", "url": "https://www.gabriel.urdhr.fr/2021/05/08/tuntap/", "date_published": "2021-05-08T00:00:00+02:00", "date_modified": "2021-05-08T00:00:00+02:00", "tags": ["computer", "system", "network", "tun", "tap", "linux", "vpn"], "content_html": "
Some notes about using the TUN/TAP interface, especially on Linux.
\nTUN/TAP is an operating-system interface\nfor creating network interfaces managed by userspace.\nThis is usually used\nto implement userspace Virtual Private Networks[1] (VPNs),\nfor example with OpenVPN,\nOpenSSH (Tunnel
configuration or -w
argument),\nl2tpns, etc.\nThis interface is exposed through the /dev/net/tun
device file\nand related ioctls\n(TUNSETIFF
, etc.).
This C code snippet creates a virtual network interface tun0
\ncontrolled by the program:
int fd = open(\"/dev/net/tun\", O_RDWR);\nstruct ifreq ifr;\nmemset(&ifr, 0, sizeof(ifr));\nifr.ifr_flags = IFF_TUN | IFF_NO_PI;\nstrncpy(ifr.ifr_name, \"tun0\", IFNAMSIZ);\nioctl(fd, TUNSETIFF, &ifr);\n
\nNote: I've omitted error handling for brevity\nin the code snippets. In real programs, you should check\nthe result of each operation.
\nThe program first opens the /dev/net/tun
file:\neach file descriptor obtained this way can be used\nby the process to communicate with one virtual network device.
Then, we use the TUNSETIFF
ioctl()
in order to configure the network\ninterface associated with this file descriptor:
ifr_flags
field to choose whether to create\na TUN (i.e. IP) or a TAP (i.e. Ethernet) interface\n(explained in the next section);ifr_flags
field to set additional options as well\n(such as the IFF_NO_PI
flag, explained afterwards);ifr_name
field is not empty, the call either creates a new\nvirtual network interface with the chosen name or attaches to an existing one\n(depending on whether a network interface with this name already exists);ifr_name
field is empty, the system creates a new interface,\ngenerates a name for it and stores this name in the ifr.ifr_name
field.After this, the program can communicate with the virtual network interface\nusing the file descriptor:
\nwrite()
sends a (single) packet or frame to the virtual network interface;read()
receives a (single) packet or frame from the virtual network interface;select()
works as expected.\nWhen the program closes the last file descriptor associated with the\nTUN/TAP interface, the system destroys the interface.\nWe can prevent the system from destroying the interface by\nmaking it persistent with the TUNSETPERSIST
ioctl()
.
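The same open-then-TUNSETIFF sequence can be driven from a scripting language. The following is a minimal Python sketch, not a reference implementation: the numeric constants are copied from linux/if_tun.h as found on x86 and ARM (an assumption, other architectures may encode TUNSETIFF differently), the 40-byte request size assumes 64-bit Linux, and the helper names build_ifreq and open_tun are made up for illustration.

```python
import fcntl
import os
import struct

# Assumed values from <linux/if_tun.h> (x86/ARM encoding of TUNSETIFF).
TUNSETIFF = 0x400454CA
IFF_TUN = 0x0001
IFF_TAP = 0x0002
IFF_NO_PI = 0x1000
IFNAMSIZ = 16


def build_ifreq(name, flags):
    # struct ifreq starts with a 16-byte name followed by a union; for
    # TUNSETIFF the union holds the 16-bit flags. The padding brings the
    # buffer to 40 bytes, the size of struct ifreq on 64-bit Linux.
    encoded = name.encode('ascii')
    if len(encoded) >= IFNAMSIZ:
        raise ValueError('interface name too long')
    return struct.pack('16sH22x', encoded, flags)


def open_tun(name='tun0'):
    # Hypothetical helper: needs root (or CAP_NET_ADMIN) and /dev/net/tun.
    fd = os.open('/dev/net/tun', os.O_RDWR)
    try:
        result = fcntl.ioctl(fd, TUNSETIFF, build_ifreq(name, IFF_TUN | IFF_NO_PI))
    except OSError:
        os.close(fd)
        raise
    # The kernel writes back the name it actually used.
    actual_name = result[:IFNAMSIZ].split(bytes(1), 1)[0].decode('ascii')
    return fd, actual_name
```

After open_tun() returns, os.read(fd, 4096) and os.write(fd, packet) play the role of read() and write() in the C snippet. Passing an empty name lets the kernel pick one, which is why the returned name is decoded from the result buffer.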
There are two types of virtual network interfaces managed by /dev/net/tun
:
TUN interfaces (IFF_TUN
) transport layer 3 (L3) Protocol Data Units (PDUs):
read()
gets an L3 PDU (an IP packet);write()
sends an L3 PDU (an IP packet);POINTOPOINT
interfaces[2].If we have set IFF_NO_PI
in the ifr_flags
field,\nthe version of the IP protocol (IPv4 or IPv6) is deduced\nfrom the IP version number in the packet.\nIf IFF_NO_PI
is not set, 4 bytes (struct tun_pi
) are prepended to\neach packet (see below for details).
TAP interfaces (IFF_TAP
) transport layer 2 (L2) PDUs:
read()
gets an L2 PDU;write()
sends an L2 PDU.\nBy default, the TAP interface is an Ethernet interface\nbut we can change the type of device\nwith the TUNSETLINK
ioctl()
.\nFor example, we can request a virtual ATM network interface with:
ioctl(fd, TUNSETLINK, ARPHRD_ATM);\n
\nwhich gives:
\n\n5: tap0:\nmtu 1500 qdisc noop state DOWN group default qlen 500\n link/atm 12:07:d9:23:19:7d brd ff:ff:ff:ff:ff:ff\n
I am not sure this is really useful, however,\nas it seems it only really works with Ethernet interfaces.
\nWe can set the hardware address (for Ethernet interfaces, the MAC address)\nwith SIOCSIFHWADDR
(see man netdevice
):
struct ifreq ifr;\nmemset(&ifr, 0, sizeof(ifr));\nifr.ifr_hwaddr.sa_family = ARPHRD_ETHER;\n// We need a placeholder socket for this:\nint sock = socket(PF_INET, SOCK_STREAM, 0);\n// Copy at most IFNAMSIZ - 1 bytes so the name stays NUL-terminated:\nstrncpy(ifr.ifr_name, iface_name, IFNAMSIZ - 1);\nmemcpy(ifr.ifr_hwaddr.sa_data, hwaddr, hwaddr_size);\nioctl(sock, SIOCSIFHWADDR, &ifr);\nclose(sock);\n
\nIf IFF_NO_PI
is not used, each packet\nread from or written to the file descriptor\nis prefixed with 4 bytes:
struct tun_pi {\n __u16 flags;\n __be16 proto;\n};\n
\nThe proto
field is an EtherType.\nThis is not very useful for TAP interfaces as far as I know.\nFor TUN interfaces, this is usually ETH_P_IP
or ETH_P_IPV6
.\nIf IFF_NO_PI
is used, the IP version of packets sent by the process is derived\nfrom the first byte of the packet (for TUN interfaces).
Currently, the only flag defined is TUN_PKT_STRIP
which is set by\nthe kernel to signal the userspace program that the packet was\ntruncated because the buffer was too small.
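To make the layout of this 4-byte prefix concrete, here is a small Python sketch that adds and strips it. It assumes the field layout shown in struct tun_pi (flags in host byte order, proto in network byte order); the function names split_tun_pi and add_tun_pi are illustrative only.

```python
import struct

ETH_P_IP = 0x0800    # EtherType for IPv4
ETH_P_IPV6 = 0x86DD  # EtherType for IPv6


def split_tun_pi(data):
    # Split a buffer read from a TUN fd opened without IFF_NO_PI.
    if len(data) < 4:
        raise ValueError('buffer shorter than struct tun_pi')
    (flags,) = struct.unpack('=H', data[:2])   # __u16, host byte order
    (proto,) = struct.unpack('!H', data[2:4])  # __be16, network byte order
    return flags, proto, data[4:]


def add_tun_pi(packet):
    # Prefix an IP packet before writing it; the EtherType is chosen from
    # the IP version nibble, flags are left at zero.
    version = packet[0] >> 4
    proto = {4: ETH_P_IP, 6: ETH_P_IPV6}[version]
    return struct.pack('=H', 0) + struct.pack('!H', proto) + packet
```

With IFF_NO_PI set, neither helper is needed: the buffers exchanged with the file descriptor are the bare packets.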
The TUNSETPERSIST
ioctl can be used to make the TUN/TAP interface persistent.\nIn this mode, the interface won't be destroyed when the last process\ncloses the associated /dev/net/tun
file descriptor.
ioctl(tap_fd, TUNSETPERSIST, 1);\n
\nThere is a race condition when\nsetting TUNSETPERSIST
. Instead, the IFF_TUN_EXCL
flag can be used to\nensure we have a new interface.\nIf this flag is used and the interface already exists, we get an EBUSY
error.
It is possible to create a persistent TUN/TAP interface\nusing the ip tuntap
command:
sudo ip tuntap add dev foo0 mode tun\n
\nIt is possible to assign a persistent interface to a given user in\norder to give a non-root user access to a TUN/TAP interface:
\nioctl(tap_fd, TUNSETPERSIST, 1);\nioctl(tap_fd, TUNSETOWNER, owner);\n
\nOr to a whole group:
\nioctl(tap_fd, TUNSETPERSIST, 1);\nioctl(tap_fd, TUNSETGROUP, group);\n
\nWe can use ip tuntap
for this as well:
sudo ip tuntap add dev foo0 mode tun user john\nsudo ip tuntap add dev foo1 mode tun group doe\n
\nThis simple program creates a TUN interface.\nEvery packet sent to this interface is printed\nto the standard error (stderr
)\nafter parsing some fields:
#include <sys/types.h>\n#include <sys/stat.h>\n#include <fcntl.h>\n#include <string.h>\n#include <unistd.h>\n#include <stdio.h>\n#include <netdb.h>\n\n#include <netinet/in.h> // IPPROTO_*\n#include <net/if.h> // ifreq\n#include <linux/if_tun.h> // IFF_TUN, IFF_NO_PI\n\n#include <sys/ioctl.h>\n
\n#define BUFFLEN (4 * 1024)\n\nconst char HEX[] = {\n '0', '1', '2', '3', '4', '5', '6', '7', '8', '9',\n 'a', 'b', 'c', 'd', 'e', 'f',\n};\n\nvoid hex(char* source, char* dest, ssize_t count)\n{\n for (ssize_t i = 0; i < count; ++i) {\n unsigned char data = source[i];\n dest[2 * i] = HEX[data >> 4];\n dest[2 * i + 1] = HEX[data & 15];\n }\n dest[2 * count] = '\\0';\n}\n
\nint has_ports(int protocol)\n{\n switch(protocol) {\n case IPPROTO_UDP:\n case IPPROTO_UDPLITE:\n case IPPROTO_TCP:\n return 1;\n default:\n return 0;\n }\n}\n\nvoid dump_ports(int protocol, int count, const char* buffer)\n{\n if (!has_ports(protocol))\n return;\n if (count < 4)\n return;\n uint16_t source_port;\n uint16_t dest_port;\n memcpy(&source_port, buffer, 2);\n source_port = ntohs(source_port);\n memcpy(&dest_port, buffer + 2, 2);\n dest_port = ntohs(dest_port);\n fprintf(stderr, \" sport=%u, dport=%u\\n\", (unsigned) source_port, (unsigned) dest_port);\n}\n\nvoid dump_packet_ipv4(int count, char* buffer)\n{\n if (count < 20) {\n fprintf(stderr, \"IPv4 packet too short\\n\");\n return;\n }\n\n char buffer2[2*BUFFLEN + 1];\n hex(buffer, buffer2, count);\n\n int protocol = (unsigned char) buffer[9];\n struct protoent* protocol_entry = getprotobynumber(protocol);\n\n unsigned ttl = (unsigned char) buffer[8];\n\n fprintf(stderr, \"IPv4: src=%u.%u.%u.%u dst=%u.%u.%u.%u proto=%u(%s) ttl=%u\\n\",\n (unsigned char) buffer[12], (unsigned char) buffer[13], (unsigned char) buffer[14], (unsigned char) buffer[15],\n (unsigned char) buffer[16], (unsigned char) buffer[17], (unsigned char) buffer[18], (unsigned char) buffer[19],\n (unsigned) protocol,\n protocol_entry == NULL ? 
\"?\" : protocol_entry->p_name, ttl\n );\n dump_ports(protocol, count - 20, buffer + 20);\n fprintf(stderr, \" HEX: %s\\n\", buffer2);\n}\n\nvoid dump_packet_ipv6(int count, char* buffer)\n{\n if (count < 40) {\n fprintf(stderr, \"IPv6 packet too short\\n\");\n return;\n }\n\n char buffer2[2*BUFFLEN + 1];\n hex(buffer, buffer2, count);\n\n int protocol = (unsigned char) buffer[6];\n struct protoent* protocol_entry = getprotobynumber(protocol);\n\n char source_address[33];\n char destination_address[33];\n\n hex(buffer + 8, source_address, 16);\n hex(buffer + 24, destination_address, 16);\n\n int hop_limit = (unsigned char) buffer[7];\n\n fprintf(stderr, \"IPv6: src=%s dst=%s proto=%u(%s) hop_limit=%i\\n\",\n source_address, destination_address,\n (unsigned) protocol,\n protocol_entry == NULL ? \"?\" : protocol_entry->p_name,\n hop_limit);\n dump_ports(protocol, count - 40, buffer + 40);\n fprintf(stderr, \" HEX: %s\\n\", buffer2);\n}\n\nvoid dump_packet(int count, char* buffer)\n{\n unsigned char version = ((unsigned char) buffer[0]) >> 4;\n if (version == 4) {\n dump_packet_ipv4(count, buffer);\n } else if (version == 6) {\n dump_packet_ipv6(count, buffer); \n } else {\n fprintf(stderr, \"Unknown packet version\\n\");\n }\n}\n
\nint main(int argc, char** argv)\n{\n if (argc != 2)\n return 1;\n const char* device_name = argv[1];\n if (strlen(device_name) + 1 > IFNAMSIZ)\n return 1;\n\n // Request a TUN device:\n int fd = open(\"/dev/net/tun\", O_RDWR);\n if (fd == -1)\n return 1;\n struct ifreq ifr;\n memset(&ifr, 0, sizeof(ifr));\n ifr.ifr_flags = IFF_TUN | IFF_NO_PI;\n strncpy(ifr.ifr_name, device_name, IFNAMSIZ);\n int res = ioctl(fd, TUNSETIFF, &ifr);\n if (res == -1)\n return 1;\n\n char buffer[BUFFLEN];\n while (1) {\n // Read an IP packet:\n ssize_t count = read(fd, buffer, BUFFLEN);\n if (count < 0)\n return 1;\n dump_packet(count, buffer);\n }\n\n return 0;\n}\n
\nNotice how read()
on the file descriptor is used by the userspace program to get a packet\nsent by the network stack to the network interface.\nIf we wanted packets to come out of the interface,\nwe would have to write()
them to the file descriptor.
The program needs to be run as root
:
sudo ./tun_reader tun0\n
\nThe interface is initially down and without an IPv4 address:
\n8: tun0: <POINTOPOINT,MULTICAST,NOARP> mtu 1500 qdisc noop state DOWN group default qlen 500\n link/none\n
\nNow we need to configure an IP address and bring the interface up before\nwe can use it:
\nsudo ip addr add 203.0.113.1/24 dev tun0 &&\nsudo ip link set up dev tun0\n
\n\n8: tun0:\nmtu 1500 qdisc pfifo_fast state UNKNOWN group default qlen 500\n link/none \n inet 203.0.113.1/24 scope global tun0\n valid_lft forever preferred_lft forever\n inet6 fe80::807f:6e64:ba6e:ee26/64 scope link stable-privacy \n valid_lft forever preferred_lft forever\n
Alternatively, we could have configured the interface from the program:
\nRTM_NEWADDR
;SIOCSIFFLAGS
ioctl
with IFF_UP
.We get this kind of output:
\n\nIPv6: src=fe80000000000000807f6e64ba6eee26 dst=ff020000000000000000000000000002 proto=58(ipv6-icmp) hop_limit=255\n HEX: 6000000000083afffe80000000000000807f6e64ba6eee26ff0200000000000000000000000000028500e5bd00000000\nIPv6: src=fe80000000000000807f6e64ba6eee26 dst=ff020000000000000000000000000016 proto=0(ip) hop_limit=1\n HEX: 6000000000240001fe80000000000000807f6e64ba6eee26ff0200000000000000000000000000163a000502000001008f00d88d0000000104000000ff020000000000000000000000010003\nIPv4: src=203.0.113.1 dst=224.0.0.22 proto=2(igmp) ttl=1\n HEX: 46c00028000040000102c7f7cb007101e0000016940400002200f9010000000104000000e00000fc\nIPv4: src=203.0.113.1 dst=224.0.0.252 proto=17(udp) ttl=255\n sport=5355, dport=5355\n HEX: 45000034e8e40000ff11b5d5cb007101e00000fc14eb14eb00205ec50cca00000001000000000000066d617276696e0000ff0001\nIPv6: src=fe80000000000000807f6e64ba6eee26 dst=ff020000000000000000000000010003 proto=17(udp) hop_limit=255\n sport=5355, dport=5355\n HEX: 600516ee001f11fffe80000000000000807f6e64ba6eee26ff02000000000000000000000001000314eb14eb001f75abedf000000001000000000000056c6f63616c0000060001\nIPv6: src=fe80000000000000807f6e64ba6eee26 dst=ff020000000000000000000000010003 proto=17(udp) hop_limit=255\n sport=5355, dport=5355\n HEX: 600516ee002011fffe80000000000000807f6e64ba6eee26ff02000000000000000000000001000314eb14eb00204b0fa87d00000001000000000000066d617276696e0000ff0001\nIPv4: src=203.0.113.1 dst=224.0.0.252 proto=17(udp) ttl=255\n sport=5355, dport=5355\n HEX: 45000033e8e50000ff11b5d5cb007101e00000fc14eb14eb001f1237c96700000001000000000000056c6f63616c0000060001\n\n
We can easily recognize from the hex dump that these are IP packets:\nthey start with the IP version (0x6
or 0x4
).
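As a quick check without any third-party library, the fixed part of the IPv4 header can be decoded with the Python standard library. This is only a sketch: parse_ipv4 is a made-up helper, it handles IPv4 only, and the sample bytes are the first 28 bytes (IP header plus UDP header) of the last packet in the dump above.

```python
import socket
import struct


def parse_ipv4(packet):
    # Decode the fixed 20-byte IPv4 header and, for TCP/UDP, the ports.
    if len(packet) < 20:
        raise ValueError('IPv4 packet too short')
    version = packet[0] >> 4       # high nibble of the first byte
    header_len = (packet[0] & 0x0F) * 4  # IHL is counted in 32-bit words
    total_len, ident = struct.unpack('!HH', packet[2:6])
    info = {
        'version': version,
        'header_len': header_len,
        'total_len': total_len,
        'id': ident,
        'ttl': packet[8],
        'proto': packet[9],
        'src': socket.inet_ntoa(packet[12:16]),
        'dst': socket.inet_ntoa(packet[16:20]),
    }
    # TCP (6) and UDP (17) both start with source and destination ports.
    if info['proto'] in (6, 17) and len(packet) >= header_len + 4:
        sport, dport = struct.unpack('!HH', packet[header_len:header_len + 4])
        info['sport'] = sport
        info['dport'] = dport
    return info


sample = bytes.fromhex(
    '45000033e8e50000ff11b5d5cb007101e00000fc'  # IPv4 header
    '14eb14eb001f1237'                          # UDP header
)
print(parse_ipv4(sample))
```

Note that the port offset is derived from the header length field rather than a fixed 20 bytes, so packets carrying IP options (such as the IGMP report in the dump) are handled as well.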
For more details, we can dissect IP packets from Python using ScaPy:
\nfrom scapy.all import *\nIP(bytes.fromhex(\"45000033e8e50000ff11b5d5cb007101e00000fc14eb14eb001f1237c96700000001000000000000056c6f63616c0000060001\")).display()\n
\n\n###[ IP ]### \n version = 4\n ihl = 5\n tos = 0x0\n len = 51\n id = 59621\n flags = \n frag = 0\n ttl = 255\n proto = udp\n chksum = 0xb5d5\n src = 203.0.113.1\n dst = 224.0.0.252\n \\options \\\n###[ UDP ]### \n sport = hostmon\n dport = hostmon\n len = 31\n chksum = 0x1237\n###[ Link Local Multicast Node Resolution - Query ]### \n id = 51559\n qr = 0\n opcode = QUERY\n c = 0\n tc = 0\n z = 0\n rcode = ok\n qdcount = 1\n ancount = 0\n nscount = 0\n arcount = 0\n \\qd \\\n |###[ DNS Question Record ]### \n | qname = 'local.'\n | qtype = SOA\n | qclass = IN\n an = None\n ns = None\n ar = None\n\n
Now we can try to ping an address on the link:
\nping 203.0.113.2\n
\nWhich gives the packets:
\n\nIPv4: src=203.0.113.1 dst=203.0.113.2 proto=icmp ttl=64\n HEX: 45000054ceab40004001f3f8cb007101cb00710208008d0b23420001f6ab9e5e00000000edd3060000000000101112131415161718191a1b1c1d1e1f202122232425262728292a2b2c2d2e2f3031323334353637\nIPv4: src=203.0.113.1 dst=203.0.113.2 proto=icmp ttl=64\n HEX: 45000054cf8340004001f320cb007101cb00710208007f9823420002f7ab9e5e00000000f945070000000000101112131415161718191a1b1c1d1e1f202122232425262728292a2b2c2d2e2f3031323334353637\n\n
The first packet is dissected by ScaPy as:
\n\n###[ IP ]### \n version = 4\n ihl = 5\n tos = 0x0\n len = 84\n id = 52907\n flags = DF\n frag = 0\n ttl = 64\n proto = icmp\n chksum = 0xf3f8\n src = 203.0.113.1\n dst = 203.0.113.2\n \\options \\\n###[ ICMP ]### \n type = echo-request\n code = 0\n chksum = 0x8d0b\n id = 0x2342\n seq = 0x1\n###[ Raw ]### \n load = '\\xf6\\xab\\x9e^\\x00\\x00\\x00\\x00\\xed\\xd3\\x06\\x00\\x00\\x00\\x00\\x00\\x10\\x11\\x12\\x13\\x14\\x15\\x16\\x17\\x18\\x19\\x1a\\x1b\\x1c\\x1d\\x1e\\x1f !\"#$%&\\'()*+,-./01234567'\n\n
Here is a simple program which creates a TAP interface and dumps\nthe Ethernet frames it receives from the kernel in hexadecimal form:
\n#include <sys/types.h>\n#include <sys/stat.h>\n#include <fcntl.h>\n#include <string.h>\n#include <unistd.h>\n#include <stdio.h>\n#include <netdb.h>\n\n#include <net/if.h> // ifreq\n#include <linux/if_tun.h> // IFF_TUN, IFF_NO_PI\n#include <linux/if_arp.h>\n\n#include <sys/ioctl.h>\n
\n#define BUFFLEN (4 * 1024)\n\nconst char HEX[] = {\n '0', '1', '2', '3', '4', '5', '6', '7', '8', '9',\n 'a', 'b', 'c', 'd', 'e', 'f',\n};\n\nvoid hex(char* source, char* dest, ssize_t count)\n{\n for (ssize_t i = 0; i < count; ++i) {\n unsigned char data = source[i];\n dest[2 * i] = HEX[data >> 4];\n dest[2 * i + 1] = HEX[data & 15];\n }\n dest[2 * count] = '\\0';\n}\n
\nint main(int argc, char** argv)\n{\n if (argc != 2)\n return 1;\n const char* device_name = argv[1];\n if (strlen(device_name) + 1 > IFNAMSIZ)\n return 1;\n\n // Request a TAP device:\n int fd = open(\"/dev/net/tun\", O_RDWR);\n if (fd == -1)\n return 1;\n struct ifreq ifr;\n memset(&ifr, 0, sizeof(ifr));\n ifr.ifr_flags = IFF_TAP | IFF_NO_PI;\n strncpy(ifr.ifr_name, device_name, IFNAMSIZ);\n int res = ioctl(fd, TUNSETIFF, &ifr);\n if (res == -1)\n return 1;\n\n char buffer[BUFFLEN];\n char buffer2[2*BUFFLEN + 1];\n while (1) {\n\n // Read a frame:\n ssize_t count = read(fd, buffer, BUFFLEN);\n if (count < 0)\n return 1;\n\n // Dump frame:\n hex(buffer, buffer2, count);\n fprintf(stderr, \"%s\\n\", buffer2);\n }\n\n return 0;\n}\n
\nLet's run it:
\nsudo ./tap_reader tap0\n
\nWe now have a TAP interface:
\n9: tap0: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN group default qlen 1000\n link/ether 7a:40:61:dc:b9:5b brd ff:ff:ff:ff:ff:ff\n
\nWe can notice several differences compared to the TUN example:
\nBROADCAST
flag instead of POINTOPOINT
;NOARP
interface anymore (we need to find L2 addresses when talking on interfaces on this network);link/ether
line with an associated MAC address.Let's bring the interface up:
\nsudo ip addr add 203.0.113.1/24 dev tap0\nsudo ip link set up dev tap0\n
\n9: tap0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UNKNOWN group default qlen 1000\n link/ether 7a:40:61:dc:b9:5b brd ff:ff:ff:ff:ff:ff\n inet 203.0.113.1/24 scope global tap0\n valid_lft forever preferred_lft forever\n inet6 fe80::7840:61ff:fedc:b95b/64 scope link tentative \n valid_lft forever preferred_lft forever\n
\nOutput:
\n\n01005e0000167a4061dcb95b080046c00030000040000102c7efcb007101e000001694040000220014050000000204000000e00000fc04000000e00000fb\n3333000000167a4061dcb95b86dd600000000024000100000000000000000000000000000000ff0200000000000000000000000000163a000502000001008f00b5520000000104000000ff0200000000000000000001ffdcb95b\n01005e0000fb7a4061dcb95b08004500004996344000ff11c871cb007101e00000fb14e914e90035b739000000000002000000000000055f69707073045f746370056c6f63616c00000c0001045f697070c012000c0001\n01005e0000fb7a4061dcb95b08004500007696354000ff11c843cb007101e00000fb14e914e90062336400000000000200000002000001310331313301300332303307696e2d6164647204617270610000ff0001066d617276696e056c6f63616c0000ff0001c02a00010001000000780004cb007101c00c000c0001000000780002c02a\n\n
We now have Ethernet frames instead of raw IP packets coming in and out\nof the virtual interface:
\n01:00:5e:00:00:16
, multicast MAC address for multicast IP address 224.0.0.22
(IGMP)01:00:5e:00:00:fb
, multicast MAC address for multicast IP address 224.0.0.251
(mDNS)7a:40:61:dc:b9:5b
).\nWe can dissect Ethernet frames with ScaPy:
\nfrom scapy.all import *\nEther(bytes.fromhex(\"01005e0000167a4061dcb95b080046c00030000040000102c7efcb007101e000001694040000220014050000000204000000e00000fc04000000e00000fb\")).display()\n
\n\n###[ Ethernet ]### \n dst = 01:00:5e:00:00:16\n src = 7a:40:61:dc:b9:5b\n type = IPv4\n###[ IP ]### \n version = 4\n ihl = 6\n tos = 0xc0\n len = 48\n id = 0\n flags = DF\n frag = 0\n ttl = 1\n proto = igmp\n chksum = 0xc7ef\n src = 203.0.113.1\n dst = 224.0.0.22\n \\options \\\n |###[ IP Option Router Alert ]### \n | copy_flag = 1\n | optclass = control\n | option = router_alert\n | length = 4\n | alert = router_shall_examine_packet\n###[ Raw ]### \n load = '\"\\x00\\x14\\x05\\x00\\x00\\x00\\x02\\x04\\x00\\x00\\x00\\xe0\\x00\\x00\\xfc\\x04\\x00\\x00\\x00\\xe0\\x00\\x00\\xfb'\n\n
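The multicast destination MAC addresses seen in the frames above are derived mechanically from the destination IP address: for IPv4 the prefix 01:00:5e is followed by the low 23 bits of the address, and for IPv6 the prefix 33:33 is followed by the low 32 bits. A short Python sketch of that mapping (the function names are illustrative, not from any library):

```python
import ipaddress


def format_mac(value):
    # Render a 48-bit integer as colon-separated hexadecimal bytes.
    return ':'.join(format((value >> shift) & 0xFF, '02x')
                    for shift in range(40, -1, -8))


def ipv4_multicast_mac(address):
    ip = ipaddress.IPv4Address(address)
    if not ip.is_multicast:
        raise ValueError('not an IPv4 multicast address')
    # 01:00:5e prefix, then the low 23 bits of the IPv4 address.
    return format_mac((0x01005E << 24) | (int(ip) & 0x7FFFFF))


def ipv6_multicast_mac(address):
    ip = ipaddress.IPv6Address(address)
    if not ip.is_multicast:
        raise ValueError('not an IPv6 multicast address')
    # 33:33 prefix, then the low 32 bits of the IPv6 address.
    return format_mac((0x3333 << 32) | (int(ip) & 0xFFFFFFFF))


print(ipv4_multicast_mac('224.0.0.22'))
print(ipv4_multicast_mac('224.0.0.251'))
print(ipv6_multicast_mac('ff02::16'))
```

This is why no ARP or neighbour discovery exchange is needed before the kernel emits these multicast frames on the TAP interface.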
Now we can try to ping an address on the link:
\nping 203.0.113.2\n
\nWhich gives:
\n\nffffffffffff7a4061dcb95b080600010800060400017a4061dcb95bcb007101000000000000cb007102\nffffffffffff7a4061dcb95b080600010800060400017a4061dcb95bcb007101000000000000cb007102\nffffffffffff7a4061dcb95b080600010800060400017a4061dcb95bcb007101000000000000cb007102\n\n
The frames are decoded by ScaPy as:
\n\n###[ Ethernet ]### \n dst = ff:ff:ff:ff:ff:ff\n src = 7a:40:61:dc:b9:5b\n type = ARP\n###[ ARP ]### \n hwtype = 0x1\n ptype = IPv4\n hwlen = 6\n plen = 4\n op = who-has\n hwsrc = 7a:40:61:dc:b9:5b\n psrc = 203.0.113.1\n hwdst = 00:00:00:00:00:00\n pdst = 203.0.113.2\n\n
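The same frame can also be decoded by hand, which shows how little structure there is: a 14-byte Ethernet header followed by a 28-byte ARP body. The sketch below uses only the Python standard library; parse_ethernet and parse_arp are hypothetical helper names, and parse_arp assumes Ethernet hardware addresses and IPv4 protocol addresses.

```python
import socket
import struct


def format_mac(raw):
    return ':'.join(format(byte, '02x') for byte in raw)


def parse_ethernet(frame):
    # Destination MAC, source MAC, EtherType, then the payload.
    dst, src, ethertype = struct.unpack('!6s6sH', frame[:14])
    return format_mac(dst), format_mac(src), ethertype, frame[14:]


def parse_arp(payload):
    # ARP body for Ethernet (6-byte) and IPv4 (4-byte) addresses.
    fields = struct.unpack('!HHBBH6s4s6s4s', payload[:28])
    hwtype, ptype, hwlen, plen, op, hwsrc, psrc, hwdst, pdst = fields
    return {
        'hwtype': hwtype,
        'ptype': ptype,
        'op': op,  # 1 is a request (who-has), 2 is a reply (is-at)
        'hwsrc': format_mac(hwsrc),
        'psrc': socket.inet_ntoa(psrc),
        'hwdst': format_mac(hwdst),
        'pdst': socket.inet_ntoa(pdst),
    }


frame = bytes.fromhex(
    'ffffffffffff7a4061dcb95b0806'
    '00010800060400017a4061dcb95bcb007101000000000000cb007102'
)
dst, src, ethertype, payload = parse_ethernet(frame)
print(dst, src, hex(ethertype), parse_arp(payload))
```

Since nothing answers these ARP requests on the TAP interface, the kernel never learns a MAC address for 203.0.113.2 and the ICMP echo requests are never emitted, unlike in the TUN case.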
man netdevice
tunctl.c
However, not all VPN software is based on TUN/TAP.\nFor example, WireGuard\nis implemented in the kernel and has its own dedicated network interface type.\nMany tunnels based on PPP use pppd
which uses pppX
\nnetwork interfaces. These are managed through /dev/ppp
\nand related ioctls.
When using an Ethernet or Wi-Fi interface, several different machines can\npossibly be reached directly through this interface:\nas a consequence, L2 addressing is needed\nto specify to which host on the LAN we are sending\na given frame.\nIn contrast, only a single machine is directly reachable\nthrough POINTOPOINT
network interfaces and there is thus no L2 addressing.\nTUN interfaces only transport L3 (IP) packets: they do not have L2 addressing\nand are thus POINTOPOINT
interfaces.
Some guidance about configuring/fixing domain name resolution\nwith a corporate Virtual Private Network (VPN),\nespecially OpenVPN and with systemd-based Linux systems.\nThis configuration uses the internal/private corporate resolvers\nfor resolving internal/private domain names\nwhile using the ISP resolver for general domain names.\nThis might help if your VPN is struggling these days\nbecause of the COVID-19 threat \ud83d\ude37.
\nI've been helping some people to make corporate VPNs (OpenVPN-based)\nwork correctly, since yesterday.\nThe problem is the resolution of domain names:
\neither private internal domain names are not resolved because the\ninternal corporate resolver is not used;
\nor general domain name resolution is slowish because the internal corporate\nresolver is used and the VPN is currently struggling with\nthe new load.
\nA good solution is to use the internal corporate resolver for internal domain\nnames but another (e.g. the ISP) resolver for other domain names.\nDepending on your configuration and how you are running the VPN\n(CLI, NetworkManager, etc.), this might be already what is happening\n(you might find some information in this post in order to check this).
\nBelow are some instructions about one solution to make this work on\nsystemd-based Linux systems using systemd-resolved.\nThere are solutions to achieve this without systemd-resolved
\n(I think it is possible to do that with dnsmasq
).
If you are using a Linux system with systemd-resolved
, a quick fix is\nsomething like this (you need to adjust these values):
# Configure internal corporate domain name resolvers:\nresolvectl dns tun0 192.0.2.53 192.0.2.54\n\n# Only use the internal corporate resolvers for domain names under these:\nresolvectl domain tun0 \"~foo.example.com\" \"~bar.example.com\"\n\n# Not super nice, but might be needed:\nresolvectl dnssec tun0 off\n
\nIf you don't have the resolvectl
command, you might use instead:
systemd-resolve -i tun0 \\\n --set-dns=192.0.2.53 --set-dns=192.0.2.54 \\\n --set-domain=foo.example.com --set-domain=bar.example.com \\\n --set-dnssec=off # <- Not super nice, but might be needed.\n
\nFirst check if you have systemd-resolved
installed and running:
systemctl status systemd-resolved\n
\nCheck if you have the resolvectl
tool:
resolvectl\n
\nAn alternative is to use systemd-resolve
:
systemd-resolve --status\n
\nCheck if you have the resolve
mechanism enabled in nsswitch.conf
:
cat /etc/nsswitch.conf | grep ^hosts: | grep resolve\n
\nInstall the needed libraries. For example on Debian or Ubuntu AMD64 systems:
\ndpkg -l | grep libnss-resolve\n
\nNote: I am installing both the 64-bit and 32-bit libraries.\nIf you do not install the 32-bit libraries, proper DNS resolution might fail\nfor 32-bit programs.
\nIf all these prerequisites are satisfied, you can configure the DNS resolution\nwith something like:
\nresolvectl dns tun0 192.0.2.53 192.0.2.54\nresolvectl domain tun0 \"~foo.example.com\" \"~bar.example.com\"\n
\nAlternatively, if you don't have resolvectl
:
systemd-resolve -i tun0 \\\n --set-dns=192.0.2.53 --set-dns=192.0.2.54 \\\n --set-domain=~foo.example.com --set-domain=~bar.example.com\n
\nThis configures systemd-resolved to use the corporate internal resolvers 192.0.2.53 and 192.0.2.54\nfor resolving all domain names under foo.example.com
and bar.example.com
\n(eg. some-service.foo.example.com
). tun0
is the network interface representing the VPN.\nYou have to adjust these parameters (IP addresses, domain names).
Note: the ~
in front of a domain prevents the domain from\nbeing added to the search list (i.e. trying to resolve the test
host name\nwill not search test.foo.example.com
if a ~
was used in front\nof foo.example.com
).
You can check your systemd-resolved configuration with:
\nresolvectl\n
\nIf you don't have resolvectl
:
systemd-resolve --status\n
\nYou can now test domain name resolution (assuming some-service.foo.example.com
\nactually exists):
resolvectl query some-service.foo.example.com\n
\nIf you don't have resolvectl
:
systemd-resolve some-service.foo.example.com\n
\nIf this works, you should have:
\n\nsome-service.foo.example.com: 192.0.2.22 -- link: tun0\n\n-- Information acquired via protocol DNS in 1.7ms.\n-- Data is authenticated: no\n\n
In my case, this failed because of:
\n\nsome-service.foo.example.com: resolve call failed: DNSSEC validation failed: failed-auxiliary\n\n
A quick/dirty fix is to disable DNSSEC:
\nresolvectl dnssec tun0 off\n
\nIf you don't have resolvectl
:
systemd-resolve -i tun0 \\\n --set-dns=192.0.2.53 --set-dns=192.0.2.54 \\\n --set-domain=foo.example.com --set-domain=bar.example.com \\\n --set-dnssec=off\n
\nIf you are using OpenVPN directly from the CLI (not through Network Manager),\nyou can automate this with the --up
option.
For example:
\nopenvpn --config client.ovpn --script-security 2 --up ./manual-config\n
\nwhere ./manual-config
is a shell script such as:
#!/bin/sh\nset -e\nresolvectl dns $dev 192.0.2.53 192.0.2.54\nresolvectl domain $dev \"~foo.example.com\" \"~bar.example.com\"\nresolvectl dnssec $dev off\n
\nor
\n#!/bin/sh\nsystemd-resolve -i $dev \\\n --set-dns=192.0.2.53 --set-dns=192.0.2.54 \\\n --set-domain=foo.example.com --set-domain=bar.example.com \\\n --set-dnssec=off # <- Not super nice, but might be needed.\n
\nOpenVPN sets the dev
\nenvironment variable\nto the name of the VPN TUN/TAP\n(i.e. virtual) network device (eg. tun0
) when calling the hooks.
Don't forget to make this script executable:
\nchmod u+x ./manual-config\n
\nupdate-systemd-resolved
For OpenVPN, an alternative is to use the\nupdate-systemd-resolved
\nscript:
openvpn \\\n --config client.ovpn \\\n --script-security 2 \\\n --up /etc/openvpn/update-systemd-resolved \\\n --down /etc/openvpn/update-systemd-resolved \\\n --down-pre\n
\nOn Debian and Ubuntu, this script is in the openvpn-systemd-resolved
package.
This automatically pulls the configuration advertised by the VPN server:
\nIf this fails, the previous section might help you pinpoint the problem:
\nresolvectl
);resolvectl query some-service.foo.example.com
)resolvectl dnssec tun0 off
).A simple way to create IP over\nUDP tunnels using\nsocat
.
This is the protocol stack we are going to implement:
\n\n[ IP ]\n[ UDP ]\n[ IP ]\n\n
Warning: no protection!
\nThis does not provide any protection (encryption, authentication)!
\nIn order to create a tunnel, we must create a\nTUN\ndevice interface. A TUN device is a network device managed by a\nuserspace program:
\nWe would like to have a simple program which manages such a TUN device by\nencapsulating its packets over a UDP socket.
\nsocat
It turns out socat
can do this already!
On the first host:
\nsudo socat UDP:192.0.2.2:9000,bind=192.0.2.1:9000 \\\n TUN:10.0.1.1/24,tun-name=tunudp,iff-no-pi,tun-type=tun,su=$USER,iff-up\n
\nOn the second one:
\nsudo socat UDP:192.0.2.1:9000,bind=192.0.2.2:9000 \\\n TUN:10.0.1.2/24,tun-name=tunudp,iff-no-pi,tun-type=tun,su=$USER,iff-up\n
\nExplanations:
\nUDP
type creates a connect()
-ed UDP socket which means only the\npackets coming from this specific remote UDP endpoint will be accepted;iff-no-pi
sets the IFF_NO_PI
flag which disables some\nencapsulation;tun-type=tun
asks for an IP-based device (as opposed to\ntun-type=tap
for an Ethernet-based device which would implement an\nEthernet over UDP tunnel);iff-up
sets the interface up;su=$USER
changes the user of the process (instead of using\nroot
).Now we can ping over the tunnel:
\nhost1:~$ ping 10.0.1.2\nPING 10.0.1.2 (10.0.1.2) 56(84) bytes of data.\n64 bytes from 10.0.1.2: icmp_seq=1 ttl=64 time=39.3 ms\n64 bytes from 10.0.1.2: icmp_seq=2 ttl=64 time=40.1 ms\n\nhost1:~$ ip route get 10.0.1.2\n10.0.1.2 dev tunudp src 10.0.1.1 \n cache\n
\nYou can add IPv6 addresses to the tunnel and it works as expected:\nboth IPv4 and IPv6 packets are sent over the same UDP socket and the\nversion field is used to distinguish them.
\nip fou
The other solution is to use ip fou
(for\nfoo over UDP):
modprobe fou\nip fou add port 9000 ipproto 4\nip link add name tunudp type ipip \\\n remote 192.168.2.2 local 192.168.2.1 ttl 225 \\\n encap fou \\\n encap-sport auto encap-dport 9000\n
\nWe can expect better performance as it is handled entirely on the\nkernel side. The downside is that you have to create two different\ntunnels (one for encapsulating IPv4 and the other for IPv6).
\n"}]}