doc/developer/WiFiDog_V2

Problems in V1

  • Too many connections to the authentication server (one per connected client every x minutes)
  • Remove ClientTimeout. All decisions regarding a client timeout should be made by the auth server

benoit: There are two very different things: 1- When the gateway detects a client as being gone, which should happen locally (with a policy possibly downloaded from the auth server) 2- When a present client should be disconnected, a decision which should be taken by the auth server, but possibly in advance. See doc/developer/TokenArchitecture

  • wdctl restart adds a lot of dirty code for it to work. (if we download the configuration periodically from the auth server, we don't need the wdctl restart anymore)

acv: There are probably ways to make it cleaner. Being able to *not* download the configuration from the server would be nice for folks wanting a more NoCatSplash-like setup. Pulling the config is obviously really nice for setups like ISF.

  • Firewall abstraction was done quickly

acv: Agree. Makes it quite hard for people to port to different firewalls.

  • No QoS support
  • Doesn't verify that the FW modules work (unless you use the shell script wifidog-init provided, Linux only)
  • Protocol is too simplistic, not flexible, proprietary
  • Wifidog doesn't open internet connection when auth server is not available (needs a policy when that happens.... allow all or deny all)

acv: I had some code to do that, not sure why it never got committed. Default behavior was defined in the config file.
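A minimal sketch of how such a fallback policy could be tracked, assuming the timings suggested in the status thread section of this page (status push every 5 minutes, retry after 30 seconds, policy applied after two failures). All type and function names here are hypothetical and do not exist in the WiFiDog code base.

```c
#include <stdbool.h>

/* Hypothetical policy values; the choice would come from the config file. */
enum outage_policy { OUTAGE_ALLOW_ALL, OUTAGE_DENY_ALL };

struct authserv_state {
    unsigned consecutive_failures;  /* failed contact attempts in a row */
    enum outage_policy policy;      /* what to do while the server is down */
};

#define NORMAL_INTERVAL_SEC    300  /* 5 minutes between status pushes */
#define RETRY_INTERVAL_SEC      30  /* faster retry after a failure */
#define FAILURES_BEFORE_POLICY   2  /* failures before the policy applies */

/* Record one contact attempt; returns the delay before the next attempt. */
unsigned authserv_record_attempt(struct authserv_state *s, bool ok)
{
    if (ok) {
        s->consecutive_failures = 0;
        return NORMAL_INTERVAL_SEC;
    }
    s->consecutive_failures++;
    return RETRY_INTERVAL_SEC;
}

/* True once enough attempts have failed for the outage policy to apply. */
bool authserv_outage_active(const struct authserv_state *s)
{
    return s->consecutive_failures >= FAILURES_BEFORE_POLICY;
}

/* Should traffic from an unknown client be let through right now? */
bool authserv_allow_unknown(const struct authserv_state *s)
{
    return authserv_outage_active(s) && s->policy == OUTAGE_ALLOW_ALL;
}
```

Keeping the counter separate from the policy means a single successful contact returns the gateway to normal behavior without touching the configured choice.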

  • Internal linked list COULD have synchronization problems with reality, suggest removing and parsing iptables output to see current LIVE state

acv: Maybe just have a verification thread that runs somewhat infrequently to reduce overhead or is process spawning performance no longer a problem to get iptables output?

philippe: if we fork once to run iptables -n -v -x --line-numbers -t filter -L every 5 minutes, it won't cost much. I know we can guarantee synchronization with the pthread mechanisms, but it's still code that "could" fail. If we can do without, I think we should.
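As a rough illustration of what reading the live state could involve, the sketch below parses one rule line of the listing produced by the command above. It assumes the usual column layout of that listing (num, pkts, bytes, target, prot, opt, in, out, source, destination) and that every rule has a target; `struct rule_counters` and `parse_rule_line` are made-up names.

```c
#include <stdio.h>

/* Fields of interest from one listed rule (hypothetical structure). */
struct rule_counters {
    unsigned num;                 /* rule number (--line-numbers) */
    unsigned long long packets;   /* exact packet count (-x) */
    unsigned long long bytes;     /* exact byte count (-x) */
    char target[32];
    char source[64];
};

/*
 * Parse one line of the listing. Returns 1 for a rule line and 0 for
 * anything else (chain titles, column headers, blank lines). On a 0
 * return the contents of *out are unspecified.
 */
int parse_rule_line(const char *line, struct rule_counters *out)
{
    char prot[16], opt[8], in[16], outif[16], dest[64];
    int n = sscanf(line, "%u %llu %llu %31s %15s %7s %15s %15s %63s %63s",
                   &out->num, &out->packets, &out->bytes, out->target,
                   prot, opt, in, outif, out->source, dest);
    return n == 10;
}
```

Header and chain-title lines fail the first numeric conversion, so they are skipped without any special casing. Trailing match details such as the MAC address are ignored here; a real implementation would need them to map a rule back to a client.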

Proposed V2 features

  • Configuration downloaded from auth server on startup (and periodically) including "who should be connected" (so it gets back to where it was on a restart)

acv: This ideally should be optional I think.

philippe: my idea is to replace the main logic of deciding who should be kicked out, allowed in, FW rules, QoS, by a call from the auth server. That way the gateway is "waiting" for commands and just executes them, which keeps the code small and clean but still flexible. Of course, if we want to support people who want to run without an auth server, we'll need to add a lot of code to make all those decisions on the gateway too. It could be as simple as using the JSON "status return" packet as the configuration file, so we have one unique parser (and leave the "should be connected now" clients out), but we'd still need to re-implement client timeout, etc. I think coding a simplistic auth server in C to run in parallel on the same box would be a better option to keep things standard and not try to accommodate all setups.

acv: A stand alone auth server in C isn't too bad of an idea. Certainly removes a lot of my objections.

  • QoS support
  • Versioned and more standard protocol
  • Better URLs (RESTful philosophy?)

Proposed architecture and ideas

Threads

Main program

  • Launches a "status" or "authserv" thread
  • It waits for the status thread to download the configuration successfully before initializing networking and denying clients
  • Once initialized, it waits for connections on its port (redirected from firewall)
    • Handles traffic the same way as before (forward to login page... gets back a token later, tests the token with the auth server, applies firewall rules)

Status thread

  • Contacts the auth server and sends status every 5 minutes (default)
    • Wifidog uptime
    • System uptime
    • System free memory
    • System load average
    • System network interface list (ifconfig -a). (needed to select which interface wifidog will use)

philippe: Explain why this is needed?

  • SSID associated to the LAN interface... (wifidog doesn't _need_ to use a wireless interface for the LAN, but it could be fun to know what the SSIDs are in case they change, etc.) Only if it's available: if wifidog runs on a router without wifi, it could just "not send it".
  • Other stats...
  • Currently connected clients
    • Incoming bandwidth statistics
    • Outgoing bandwidth statistics
    • QoS settings
    • Specific firewall rules
  • If auth server can't be contacted for some reason, retry faster (30 seconds maybe)
  • If it failed contacting the auth server twice, apply the "what to do when the internet is down" policy (either "allow" or "reject but explain what's going on")
  • When it receives the status from the auth server, it will also receive a list of "who's supposed to be connected". It will do a DIFF of this with what it has to remove/add who should be there
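The DIFF step in the last item could look roughly like the sketch below. It assumes both client lists are available as arrays of MAC address strings, sorted in strcmp() order and free of duplicates; the callback type and the counting callbacks are placeholders for whatever would actually add or remove firewall rules.

```c
#include <string.h>

/* Hypothetical callback type: invoked once per MAC to add or remove. */
typedef void (*client_cb)(const char *mac, void *ctx);

/*
 * Walk two sorted, duplicate-free MAC lists in a single pass.
 * Calls on_remove for MACs present only in `local` and on_add for
 * MACs present only in `wanted`; MACs in both lists are left alone.
 */
void diff_clients(const char *const *local, size_t n_local,
                  const char *const *wanted, size_t n_wanted,
                  client_cb on_add, client_cb on_remove, void *ctx)
{
    size_t i = 0, j = 0;

    while (i < n_local || j < n_wanted) {
        int cmp;

        if (i == n_local)
            cmp = 1;            /* only wanted entries remain */
        else if (j == n_wanted)
            cmp = -1;           /* only local entries remain */
        else
            cmp = strcmp(local[i], wanted[j]);

        if (cmp < 0)
            on_remove(local[i++], ctx);
        else if (cmp > 0)
            on_add(wanted[j++], ctx);
        else {
            i++;
            j++;
        }
    }
}

/* Example callbacks that only count; real ones would touch the firewall. */
struct diff_counts { int added; int removed; };

void count_added(const char *mac, void *ctx)
{
    (void)mac;
    ((struct diff_counts *)ctx)->added++;
}

void count_removed(const char *mac, void *ctx)
{
    (void)mac;
    ((struct diff_counts *)ctx)->removed++;
}
```

Sorting both lists first keeps the comparison linear in the number of clients, which matters little at this scale but keeps the logic easy to follow.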

http client thread

  • This thread is launched when a new connection is initiated. If 80 people (just saying) connect to the auth server, 80 threads will be launched at the same time. It would be GREAT to have a thread pool, but it would not be useful 99% of the time and would take a long time to code and debug. For now, this thread forwards to the right spot or validates tokens.

acv: WiFiDog isn't a high-performance web server. The basic philosophy of libhttpd is ultimately unsuited to handling connections asynchronously, so a thread pool would merely mean that most of those 80 people get no reply whatsoever until someone else's request is done. Either that, or libhttpd is re-architected around an async network core (select() or poll() based) and a number of page-generation worker threads that do not handle any networking: they just fill a memory buffer with their reply and flag it so the network core can flush it back.

philippe: I think we agree that we can keep the way it works as-is :)

Options on command line

The required values to set:

  • --auth <authserver hostname>: the auth server's hostname
  • --lan <interface>: the LAN interface
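One possible way to read the two required options is getopt_long(), as sketched below. The structure and function names are illustrative only, and the short option letters are internal to the sketch, not part of the proposal.

```c
#include <getopt.h>
#include <stddef.h>

/* Hypothetical holder for the two required settings. */
struct wd_options {
    const char *auth_host;  /* value of --auth */
    const char *lan_iface;  /* value of --lan */
};

/* Returns 0 when both required options are present, -1 otherwise. */
int parse_options(int argc, char **argv, struct wd_options *opts)
{
    static const struct option long_opts[] = {
        { "auth", required_argument, NULL, 'a' },
        { "lan",  required_argument, NULL, 'l' },
        { NULL, 0, NULL, 0 }
    };
    int c;

    opts->auth_host = NULL;
    opts->lan_iface = NULL;
    optind = 1;  /* restart scanning so the function can be called again */

    while ((c = getopt_long(argc, argv, "", long_opts, NULL)) != -1) {
        switch (c) {
        case 'a':
            opts->auth_host = optarg;
            break;
        case 'l':
            opts->lan_iface = optarg;
            break;
        default:
            return -1;  /* unknown option or missing argument */
        }
    }
    return (opts->auth_host && opts->lan_iface) ? 0 : -1;
}
```

A call such as `wifidog --auth auth.example.org --lan eth1` (placeholder hostname and interface) would fill both fields; leaving either option out makes the function return -1.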

The node ID will be (default) the mac address associated to the interface that the default route is using... (parsing netstat -rn). It should sleep and retry to parse in case the networking (pppoe for example) is not up yet.
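A sketch of the parsing half of that idea follows. It assumes the Linux net-tools layout of `netstat -rn` (Destination, Gateway, Genmask, Flags, MSS, Window, irtt, Iface); other systems print different columns. The function name is made up, and looking up the MAC address of the returned interface is not shown.

```c
#include <stdio.h>
#include <string.h>

/*
 * Scan the text printed by `netstat -rn` and copy the interface name of
 * the default route into iface. Returns 0 on success, or -1 when no
 * default route is listed (for example because pppoe is not up yet).
 */
int default_route_iface(const char *netstat_out, char *iface, size_t len)
{
    const char *line = netstat_out;

    while (*line) {
        size_t n = strcspn(line, "\n");   /* length of the current line */
        char buf[256];

        if (n < sizeof buf) {
            char dest[64], gw[64], mask[64], flags[16], name[32];
            unsigned mss, window, irtt;

            memcpy(buf, line, n);
            buf[n] = '\0';
            /* Title and header lines fail the numeric conversions. */
            if (sscanf(buf, "%63s %63s %63s %15s %u %u %u %31s",
                       dest, gw, mask, flags,
                       &mss, &window, &irtt, name) == 8 &&
                (strcmp(dest, "0.0.0.0") == 0 ||
                 strcmp(dest, "default") == 0)) {
                snprintf(iface, len, "%s", name);
                return 0;
            }
        }
        line += n;
        if (*line == '\n')
            line++;
    }
    return -1;
}
```

The caller can wrap this in a loop that sleeps and runs the command again while the function returns -1, which covers the case where the uplink comes up late.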

acv: The ability to have a stand-alone gateway without downloading config would be nice. I think this philosophy of "download everything" makes the barrier to deployment significantly larger than it needs to be.

philippe: with the current auth server (all the dependencies, etc.) I agree. I think I would personally rather spend some effort coding a light solution in C that could be distributed straight with wifidog (why not?) with a small web interface to configure it.

When wifidog initializes its config, it should get options like "this is the URL to contact the login page". Ideas:

  • global bandwidth settings (max incoming and outgoing)
  • login page url
  • portal url
  • timers (push status, etc.)

Things tested (proof of concept, code, etc.)

JSON for protocol

XML or YAML would have been great, but I tried Syck ( http://whytheluckystiff.net/syck/) and it didn't seem trivial to use. It looks like it supports a stream parser, whereas we need something more like a DOM to look up the values returned.

acv: How big of a library footprint does JSON have in KB (Non-issue-ish it seems, 36KB on osx 10.5 no funky dependencies)? What about PHP support (Requires PHP 5.2.0 for availability in a default install)?

philippe: Right, it's pretty small. We could do our own parser, but why reinvent the wheel for a few KB? Also, since there are other efforts to code an auth server in other languages, I think we kind of need to go with a standard encapsulation like JSON this time. I'm a big XML fan myself, but that would be too much (size of lib, etc. again, unless we make our own parser...).

acv: Expat isn't too bad as library goes for XML parsing. I've used it before as long as you don't mind working with a stack it's fast and simple.

philippe: I just compiled expat on a linux box. once stripped: 130636 bytes, json-c (stripped) is 21848 bytes. Again, I'd love to go with XML for its formalism, but not if the parser is that big.

While I find it ugly to look at compared to YAML, JSON looks viable if a dependency on either PHP 5.2.0 or optional PHP5 module is OK. Library looks like it should build OK for C (sort of loath to get the whole libhttpd problems though).

acv: (To give YAML a chance:) Also, what about spyc ( http://spyc.sourceforge.net/) to load YAML in PHP instead of Syck? For the size of data structure we're likely to use, high performance is a non-issue. Also LibYAML ( http://pyyaml.org/wiki/LibYAML) appears to be the recommended C implementation.

philippe: I gotta say I haven't tried libyaml because their page says "It's in an early stage of development", with the current version dating from a year and a half ago. If it offers a way to parse and generate YAML as easily as I've done it with json-c, I don't see why we wouldn't use it.

JSON (with json-c-0.7  http://oss.metaparadigm.com/json-c/) gives us that in C and is quite elegant.

Here's how we can generate JSON:

struct json_object *status_object = json_object_new_object();
json_object_object_add(status_object, "wifidog_version", json_object_new_string(VERSION));
json_object_object_add(status_object, "protocol_version", json_object_new_double(2.0));
json_object_object_add(status_object, "node_id", json_object_new_string(node_id));
json_object_object_add(status_object, "fetch_config", json_object_new_boolean(1));

struct json_object *node_status_object = json_object_new_object();
json_object_object_add(node_status_object, "wifidog_uptime", json_object_new_int(25));
json_object_object_add(node_status_object, "sys_uptime", json_object_new_int(get_sys_uptime()));
json_object_object_add(node_status_object, "sys_loadavg", json_object_new_double(get_sys_loadavg()));
json_object_object_add(node_status_object, "sys_memfree", json_object_new_int(get_sys_memfree()));

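/* Attach the nested object so it is serialized too; the "node_status" key name is an assumption */
json_object_object_add(status_object, "node_status", node_status_object);

/* The returned string is owned by status_object, so it must not be freed separately */
const char *json = json_object_to_json_string(status_object);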

It returns the JSON string.

To parse:

struct json_object * json_object = json_tokener_parse(the_string);
struct json_object * value_json_object = json_object_object_get(json_object, "node_id");
printf("%s\n", json_object_get_string(value_json_object));

This will retrieve the string value of "node_id" at the first level in the tree.

Tests that the required firewall rules work (Linux)

These lines should test all the modules we need:

iptables -A INPUT -m mac --mac-source 00:00:00:00:00:00 -j ACCEPT
iptables -D INPUT -m mac --mac-source 00:00:00:00:00:00 -j ACCEPT
iptables -t nat -A PREROUTING -p tcp --dport 9999 -j REDIRECT --to-ports 2060
iptables -t nat -D PREROUTING -p tcp --dport 9999 -j REDIRECT --to-ports 2060

If this works, we should be good to go. I'm sure there are more to try (MARK if we still use it), but it's a start.
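The probe above could be driven from C along the following lines. This is only a sketch: the command runner is passed in as a function pointer so the logic can be exercised without root or iptables, and all names are invented for the example. On a real gateway, `system` from `<stdlib.h>` matches the runner signature and can be passed directly.

```c
#include <stddef.h>
#include <string.h>

/* Runs one shell command and returns 0 on success. */
typedef int (*cmd_runner)(const char *cmd);

/* The four probe commands listed above, in add/delete pairs. */
static const char *const fw_probe_cmds[] = {
    "iptables -A INPUT -m mac --mac-source 00:00:00:00:00:00 -j ACCEPT",
    "iptables -D INPUT -m mac --mac-source 00:00:00:00:00:00 -j ACCEPT",
    "iptables -t nat -A PREROUTING -p tcp --dport 9999 -j REDIRECT --to-ports 2060",
    "iptables -t nat -D PREROUTING -p tcp --dport 9999 -j REDIRECT --to-ports 2060",
};

/*
 * Run the probes in order and stop at the first failure.
 * Returns 0 when all succeed, else the 1-based index of the failing command.
 */
int fw_probe(cmd_runner run)
{
    size_t i;

    for (i = 0; i < sizeof fw_probe_cmds / sizeof fw_probe_cmds[0]; i++) {
        if (run(fw_probe_cmds[i]) != 0)
            return (int)(i + 1);
    }
    return 0;
}

/* Dry-run runner: pretends every command worked. */
int run_dry(const char *cmd)
{
    (void)cmd;
    return 0;
}

/* Dry-run runner that simulates a kernel without the nat table. */
int run_no_nat(const char *cmd)
{
    return strstr(cmd, "-t nat") != NULL ? 1 : 0;
}
```

Because each added rule is deleted by the next command, stopping at the first failure does not leave a probe rule behind unless a delete itself fails.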

Firewall rules (Linux)

We used to use MARKs to tag what traffic is known, unknown or in validation.

Now, traffic should simply be known or unknown; specific deny rules (for example, nothing but port 80) should be sent as "deny acl rules".

We should try to stop using MARKs, as we only have 255 (I don't think it would be trivial or good to track "who has which mark" and live with a limit of 255).

Here's what I experimented with; it's not complete but I'm posting it anyway:

/* Everything from LAN to WAN interface */
iptables -t filter -N wd_lan2wan
/* Default rules (example, deny TCP 25 always) */
iptables -t filter -N wd_lan2wan_fromauth
/* Allowed clients (we can track outgoing traffic here) */
iptables -t filter -N wd_lan2wan_clients
/* The last REJECT rule (unknown traffic) */
iptables -t filter -N wd_lan2wan_defaults
/* For incoming stats */
iptables -t filter -N wd_incoming_stats

/* Insert the "catch everything from lan to wan" */
iptables -t filter -I FORWARD 1 -i LAN_INTERFACE -o WAN_INTERFACE -j wd_lan2wan
/*
  * Insert the incoming stats rule _ON TOP_ (we'll RETURN in this chain so we still go to the next chains)
  *
  * We do not seem to be able to track incoming traffic by mac address, needs the IP
  */
iptables -t filter -I FORWARD 1 -j wd_incoming_stats
/* 1. Global settings (drop going to port 25 for everyone) */
iptables -t filter -A wd_lan2wan -j wd_lan2wan_fromauth
/* 2. Allowed clients and specific client rules (and outgoing bandwidth) */
iptables -t filter -A wd_lan2wan -j wd_lan2wan_clients
/* 3. Deny rule */
iptables -t filter -A wd_lan2wan -j wd_lan2wan_defaults

/* We can only do the redirect to local port 2060 in the PREROUTING or OUTPUT chains in the "nat" table */
iptables -t nat -N wd_redirect
iptables -t nat -I PREROUTING 1 -j wd_redirect

/* For pppoe to work properly..... make sure we have this, even though the router will probably have it, we just have to insert ourselves at the right places? */
iptables -t filter -A wd_lan2wan -p tcp --tcp-flags SYN,RST SYN -j TCPMSS --clamp-mss-to-pmtu

/* Allow DNS please */
iptables -t filter -A wd_lan2wan_defaults -p tcp --dport 53 -j ACCEPT
iptables -t filter -A wd_lan2wan_defaults -p udp --dport 53 -j ACCEPT

/* Allow auth server */
iptables -t filter -A wd_lan2wan_defaults -d AUTHSERV_HOSTNAME -j ACCEPT

/* Base "reject" rule */
iptables -t filter -A wd_lan2wan_defaults -j REJECT

/* Redirect to local wifidog port 2060 */
iptables -t nat -A wd_redirect -p tcp --dport 80 -j REDIRECT --to-ports 2060

Almost all of this works, but we have to find the right recipe.

When allowing a client:

/* Track incoming stats (BY IP) */
iptables -t filter -A wd_incoming_stats -d 192.168.1.10 -j RETURN

/* Allow MAC address and count outgoing stats (by MAC and IP) */
iptables -t filter -A wd_lan2wan_clients -s 192.168.1.10 --match mac --mac-source 01:02:03:04:05:06 -j ACCEPT

/* Don't redirect anymore please (inserted on TOP) */
iptables -t nat -I wd_redirect 1 -s 192.168.1.10 --match mac --mac-source 01:02:03:04:05:06 -j ACCEPT

I'd much rather have only 1 rule to add to allow a client, but we need to count the stats. If we could only have TWO rules, that'd be neat, but then we'd have to MARK the traffic at some point I think if we want to avoid that REDIRECT rule. We could do it, but... that assumes (like before) that a specific "magic number" MARK is available (not in use).
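For reference, the three per-client commands above could be generated from an IP and MAC pair as in this sketch. The structure, function name, and buffer size are assumptions for illustration; the chain names are the ones used in the listing above.

```c
#include <stdio.h>

#define WD_RULE_MAX 160  /* arbitrary buffer size chosen for this sketch */

/* The three commands needed to let one client through (hypothetical type). */
struct client_rules {
    char incoming_stats[WD_RULE_MAX];  /* count incoming traffic by IP */
    char allow[WD_RULE_MAX];           /* accept and count outgoing traffic */
    char no_redirect[WD_RULE_MAX];     /* stop redirecting port 80 */
};

/*
 * Fill r with the commands for one client. Returns 0 on success, or -1
 * if a command would not fit. ip and mac must already be validated,
 * since the strings end up on a shell command line.
 */
int build_client_rules(const char *ip, const char *mac, struct client_rules *r)
{
    int n1 = snprintf(r->incoming_stats, WD_RULE_MAX,
        "iptables -t filter -A wd_incoming_stats -d %s -j RETURN", ip);
    int n2 = snprintf(r->allow, WD_RULE_MAX,
        "iptables -t filter -A wd_lan2wan_clients -s %s "
        "--match mac --mac-source %s -j ACCEPT", ip, mac);
    int n3 = snprintf(r->no_redirect, WD_RULE_MAX,
        "iptables -t nat -I wd_redirect 1 -s %s "
        "--match mac --mac-source %s -j ACCEPT", ip, mac);

    if (n1 < 0 || n1 >= WD_RULE_MAX ||
        n2 < 0 || n2 >= WD_RULE_MAX ||
        n3 < 0 || n3 >= WD_RULE_MAX)
        return -1;
    return 0;
}
```

Generating all three strings in one place makes it easier to emit the matching `-D` commands later when the client is removed.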

QoS tests

TODO