I like the sound of the optimization you have been doing on BLAS. I'm amazed, since I thought it was already close to optimal.
We have actually been looking at the Boost/BLAS version. I wasn't aware that it was substantially different from other BLAS implementations, apart from being written in C++.
What I was planning to strip from the Boost/BLAS code was the optimization for specific platforms. My original intention was not to create the fastest possible BLAS version (you custom install BLAS for your chipset for that), but to create something that has all the same functionality (names, arguments, etc.) while letting people incorporate it into their project very quickly, in much the same way as we hope the CodeCogs code is (i.e. atomic). Size and ease of use are more important than speed.
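To make "same names and arguments, no platform tuning" concrete, here is a rough sketch of what one such routine might look like. It assumes the argument order of the reference BLAS routine daxpy (n, alpha, x, incx, y, incy); the `sketch` namespace is only there for illustration, and this is not taken from the Boost or CodeCogs code.

```cpp
namespace sketch {

// Plain, untuned AXPY: y := alpha * x + y.
// Assumption: arguments follow the reference BLAS daxpy order
// (n, alpha, x, incx, y, incy). No platform-specific code paths.
void daxpy(int n, double alpha, const double* x, int incx,
           double* y, int incy)
{
    if (n <= 0 || alpha == 0.0) return;   // nothing to do

    // For a negative increment, reference BLAS walks the vector
    // backwards, starting from its far end.
    int ix = (incx < 0) ? (1 - n) * incx : 0;
    int iy = (incy < 0) ? (1 - n) * incy : 0;

    for (int i = 0; i < n; ++i) {
        y[iy] += alpha * x[ix];
        ix += incx;
        iy += incy;
    }
}

} // namespace sketch
```

Something this small can be dropped into a project as a single file, which is the trade-off I have in mind: a tuned BLAS would be faster, but this has nothing to configure.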
Our original thoughts were to try to clean up the BLAS library automatically, so that when an update to BLAS is issued we can incorporate it quickly onto the system. For this we need to look carefully at how the code is written (looking for #ifdef etc.), select the platform implementation that works best across all platforms, and then strip out the rest using something like Perl. I don't know how easy this would be to achieve. Any thoughts?
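As a starting point for discussion, the stripping step might look something like the sketch below. It is written in C++ rather than Perl purely for illustration, and `strip_ifdefs` and `Frame` are made-up names. It assumes we pick a fixed set of macros to treat as defined, resolves only `#ifdef` / `#ifndef` / `#else` / `#endif`, and passes `#if` blocks through untouched because it does not evaluate expressions. An `#elif` attached to an `#ifdef` is simply rejected.

```cpp
#include <cstddef>
#include <set>
#include <sstream>
#include <stdexcept>
#include <string>
#include <vector>

namespace sketch {

// One entry per open conditional block.
struct Frame {
    bool parent_active;  // was the enclosing region being kept?
    bool cond;           // did the #ifdef / #ifndef test succeed?
    bool in_else;        // have we passed the matching #else?
    bool passthrough;    // true for #if blocks we do not evaluate
    bool active() const {
        if (passthrough) return parent_active;
        return parent_active && (in_else ? !cond : cond);
    }
};

// Returns 'source' with #ifdef/#ifndef blocks resolved against 'defined'.
std::string strip_ifdefs(const std::string& source,
                         const std::set<std::string>& defined)
{
    std::istringstream in(source);
    std::ostringstream out;
    std::vector<Frame> stack;
    std::string line;
    while (std::getline(in, line)) {
        const bool active = stack.empty() || stack.back().active();
        // Split "#  ifdef NAME" into the directive word and its argument.
        std::string word, arg;
        const std::size_t hash = line.find_first_not_of(" \t");
        if (hash != std::string::npos && line[hash] == '#') {
            std::istringstream rest(line.substr(hash + 1));
            rest >> word >> arg;
        }
        if (word == "ifdef" || word == "ifndef") {
            const bool is_def = defined.count(arg) > 0;
            stack.push_back({active, word == "ifdef" ? is_def : !is_def,
                             false, false});
        } else if (word == "if") {
            // Not evaluated: keep the directive and both branches.
            stack.push_back({active, true, false, true});
            if (active) out << line << '\n';
        } else if (word == "else" || word == "elif" || word == "endif") {
            if (stack.empty()) throw std::runtime_error("unbalanced #" + word);
            Frame& top = stack.back();
            if (top.passthrough) {
                if (top.parent_active) out << line << '\n';
            } else if (word == "elif") {
                throw std::runtime_error("#elif after #ifdef is not handled");
            } else if (word == "else") {
                top.in_else = true;
            }
            if (word == "endif") stack.pop_back();
        } else if (active) {
            out << line << '\n';  // ordinary line in a kept region
        }
    }
    if (!stack.empty()) throw std::runtime_error("missing #endif");
    return out.str();
}

} // namespace sketch
```

Calling it with an empty macro set keeps the `#else` branches, i.e. the generic code path. The hard part, deciding which branch really is "best across all platforms", still needs a human to choose the macro set for each library.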
In terms of which functions: anything and everything. The trouble (or benefit, depending on how you look at it) is that a vast range of different users use CodeCogs, from pure maths to physics to engineering. I couldn't possibly second-guess what they all might want one day. However, we never aim to run before we can walk, so we typically start by uploading the smaller, simpler components (especially those that other functions depend on) and work out to the larger, more complex stuff. On the whole, though, we all work on the areas that most interest us and that fit with the work we're doing elsewhere.
Thinking of other vector libraries, another great system is the Blitz library, which is probably faster than BLAS. However, it's massively dependent on templates, is less widely used, and we also felt it was a little harder to break into individual parts and post onto a system such as CodeCogs. It is still possible, though, and something we're considering.
Cheers
Will.