BQL lets us get away from having to keep TX rings artificially low, and instead manage the driver queue in bytes rather than ring slots. We should also (finally) be able to get away from artificially short txqueuelens with what I describe below…
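For the curious, BQL's state is visible in sysfs on drivers that support it. A minimal sketch, where the interface, queue number, and byte limit are purely illustrative:

    # per-tx-queue BQL state, on a BQL-enabled driver
    cat /sys/class/net/eth0/queues/tx-0/byte_queue_limits/limit
    # cap how many bytes BQL will let sit in the driver/ring (value is illustrative)
    echo 6000 > /sys/class/net/eth0/queues/tx-0/byte_queue_limits/limit_max
    # with BQL in charge of the driver, txqueuelen can go back to something sane
    ip link set dev eth0 txqueuelen 1000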
The default QoS ‘prioritization’ scheme as supplied by openwrt simply is not the ‘right thing’ for modern high-bandwidth environments, and its unholy HFSC + SFQ + RED combination is nearly impossible to understand. So…
By default it will be QFQ (the Quick Fair Queueing scheduler) + RED at line rate. This is working really well, at least at 100Mbit line rates. More testing, of course, is required, and tuning RED for the various line rates (10, 100, 1000 Mbit) will also be required.
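Roughly, the line-rate version looks something like this; the interface name, class count, and RED numbers below are illustrative, not the values the scripts will ship with:

    tc qdisc add dev eth0 root handle 1: qfq
    # a set of equal-weight QFQ classes, each with RED underneath
    for i in 1 2 3 4 5 6 7 8; do
        tc class add dev eth0 parent 1: classid 1:$i qfq weight 1
        tc qdisc add dev eth0 parent 1:$i red \
            limit 400000 min 30000 max 90000 avpkt 1000 burst 55 \
            bandwidth 100mbit probability 0.02 ecn
    done
    # hash flows across the classes
    tc filter add dev eth0 parent 1: protocol all prio 1 \
        flow hash keys src,dst,proto,proto-src,proto-dst divisor 8 baseclass 1:1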
The biggest problem is that nobody runs at line rate anymore. So we need a bandwidth test - presently netperf - that can be run either automatically or from the gui, to set at least the upload rate. We need to run it twice: once for a brief duration, to see what shaping is happening, and once for longer, to see what happens after things settle down.
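Something like the following, where the server host and durations are only placeholders:

    # quick pass: a rough idea of the upload rate and of any shaping being applied
    netperf -H netperf.example.org -t TCP_STREAM -l 10
    # long pass: what the rate looks like once things settle down
    netperf -H netperf.example.org -t TCP_STREAM -l 60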
When an upload parameter has been generated, we will shape on top of QFQ + RED using CBQ or HTB: a single set of QFQ classes, most likely each with a RED sub-qdisc, as no better alternative is yet available. My experiments with SFB, for example, have had it overly punishing most flows, though I’m not entirely willing to give up on it.
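A sketch of the shaped variant, assuming HTB; the device name and the 900kbit figure stand in for whatever the bandwidth test measures:

    # shape to the measured upload rate...
    tc qdisc add dev eth0 root handle 1: htb default 1
    tc class add dev eth0 parent 1: classid 1:1 htb rate 900kbit ceil 900kbit
    # ...and hang the same QFQ + RED tree described above beneath it,
    # parented on 1:1 instead of the root
    tc qdisc add dev eth0 parent 1:1 handle 10: qfq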
There will be no priorities per se, only fairness between flows.
It’s my hope to otherwise preserve the properties of the existing configuration gui and configuration files in openwrt. Getting the default txqueuelen, tx rings, and RED right for the various bandwidths between 128kbit and 100Mbit is going to be really hard…
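For RED specifically, one plausible starting point is to size min for a target queue delay and compute burst per the rule of thumb in the tc-red documentation. Everything below is a guess to be tuned, not a tested set of values:

    RATE_KBIT=1000                       # link or shaped rate, illustrative
    DELAY_MS=100                         # target worst-case queue delay
    AVPKT=1000

    MIN=$(( RATE_KBIT * DELAY_MS / 8 ))  # kbit/s * ms / 8 = bytes queued at that delay
    MAX=$(( MIN * 3 ))
    LIMIT=$(( MAX * 4 ))
    BURST=$(( ( 2 * MIN + MAX ) / ( 3 * AVPKT ) + 1 ))

    # attach under each QFQ class, as in the examples above (parent is illustrative)
    tc qdisc add dev eth0 parent 10:1 red \
        limit $LIMIT min $MIN max $MAX avpkt $AVPKT burst $BURST \
        probability 0.02 ecn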
The third potential shaper is ‘bittorrent mode’, which would do fairness between machines rather than between flows. Thus far the only way I’ve figured out to implement this is with a pre-NAT per-IP filter on top of the above, which results in de-FQing all the flows coming from one machine - not good. I think - but am not sure - that I might be able to create a set of filters and hashes that lets you use QFQ on a per-machine basis and still promote fairness between a limited number of flows on each machine, but I haven’t got there yet. So you’d get something like 32 flows per machine to hash into, and 256 machines with unique buckets to hash against… and/or a DSCP filter would help with the hash buckets.
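For reference, the simple (and, as described, unsatisfactory) per-machine form can at least be expressed with the flow classifier’s conntrack keys; the divisor, device, and baseclass here are illustrative, and assume 256 QFQ classes exist beneath 10::

    # pre-NAT source address via conntrack, hashed into 256 per-machine buckets;
    # this is the 'all of a machine's flows land in one bucket' form,
    # not the 32-flows-per-machine-within-256-machines goal
    tc filter add dev eth0 parent 10: protocol ip prio 1 \
        flow hash keys nfct-src divisor 256 baseclass 10:1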
It would be nice to have a ‘run bandwidth test’ option in the gui that can pull the results out of netperf (which does have a nice CSV output mode).
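Something along these lines, assuming a recent netperf with the omni tests; the output selectors are just examples:

    # machine-readable, comma-separated output for the gui to scrape
    netperf -H netperf.example.org -t omni -l 60 -- \
        -o THROUGHPUT,THROUGHPUT_UNITS,ELAPSED_TIME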
I MAY try to bump up the clock rate to give CBQ something to work with. Regrettably, I’ve been able to lock up CBQ regularly with various prototypes of this setup, so I’m switching to HTB, in a very limited mode, for now…
As these scripts will not be CeroWrt-specific - they merely require BQL and QFQ - it’s my hope that they can be tried on multiple gateway setups, such as the GuruPlug and/or x86 boxes, etc., to get a feel for the valid ranges. I’d love to see people try BQL and QFQ on desktops, too - it seems to help a lot on 100Mbit ethernet and a great deal on gigabit…
Gargoyle’s setup should also be looked at.
The various and sundry scripts are living in this repository at present: https://github.com/dtaht/deBloat
(although, as I write this, the only working prototypes are the eqfq and eqfq.red scripts in the wip dir)