RooJSolutions http://roojs.com/index.php/View.html Roo J Solutions Limited is recruiting 2012-01-31 00:00:00 http://roojs.com/index.php/View/247/Roo_J_Solutions_Limited_is_recruiting.html <a href="http://roojs.com/index.php/View.html">Article originally from rooJSolutions blog</a><br/> Since we have been very busy already this year, I have now almost completed the process of migrating from a sole proprietorship to a limited company. Roo J Solutions Limited is now a registered Hong Kong company. We are now looking for full or part-time staff (based in Hong Kong).<div><br></div><div>Please read the full post for details.</div> Free your data... seed webkit browser mirror button 2012-01-14 00:00:00 http://roojs.com/index.php/View/246/Free_your_data_seed_webkit_browser_mirror_button.html <div>One of the great things about the internet is the availability of cheap or free services online, so many clients are using Gmail, Dropbox, GitHub etc. for their business operations. But all too often they forget that these services are often playing the oldest game in the technology industry: "Vendor Lock-in".</div><div><br></div><div>The ones I mentioned are not too bad: you can cheaply and easily rescue or back up your data to another location, or move to an alternative provider. Not all of them are like that.</div><div><br></div><div>We are in the middle of a migration project from Netsuite (a SaaS, Oracle-based ERP system) to xTuple, which is an open source ERP system based around PostgreSQL. This is a slow and painful migration, as there is no standard for ERP data, and exporting is slow and clumsy over SOAP. 
Anyway, as a pleasant distraction from this large migration, the same client also wanted us to look at migrating from Backpack, a 37signals product.</div><div><br></div><div>Backpack, unlike all the SaaS systems I mentioned, has deliberately made it hard, or practically impossible, to migrate from their services. The primary offering of Backpack is an online file storage service that lets you give clients or suppliers the ability to share files and folders. It is only web based (unlike Dropbox or box.net), and there is no desktop client that you can use to access the files other than the web interface.</div><div><br></div><div>When I started looking at how the company could extract the data, I tried out a few of the classic tools, like wget and httrack; however, the heavy use of Javascript and the convoluted login system with login keys ensured that those kinds of tools did not work. The other requirement was the ability to organise the files into folders; by just mirroring the site, you would end up with thousands of folders called asset/123123/, where the number is probably the UID of the database record.</div><div><br></div><div>So how to rescue the data... Read on for the trick..</div><div><br></div> Deleting the View and Controller.. 2011-12-15 00:00:00 http://roojs.com/index.php/View/244/Deleting_the_View_and_Controller.html This is NOT a post for people who do not use MVC. Please delete your code, and write it properly.. Anyway, as anybody who has used or written a reasonable framework in PHP knows, MVC is pretty much the golden rule for implementation. There are a dozen frameworks out there based around the principles, with different levels of complexity.<div><br></div><div>My own framework was designed around those principles, and for many years worked perfectly for those classic "display a crap load of HTML pages using information from a database" sites. 
The Model (<a href="http://pear.php.net/package/DB_DataObject">DB_DataObject</a>s), View (<a href="http://pear.php.net/package/HTML_Template_Flexy">HTML_Template_Flexy</a>) and Controller (classes that extend <a href="http://www.roojs.com/mtrack/index.php/File/default/pear/HTML/FlexyFramework/Page.php">HTML_FlexyFramework_Page</a>) delivered pages. Designing sites basically involved gluing all these pieces together. As the sites grew over time, shared code usually ended up in the Models, and each page had a controller which might render the shared templates. All was well, and code was reasonably easy to maintain and extend.</div><div><br></div><div>Now, however, almost all the projects I've worked on in the last few years use the <a href="http://www.roojs.org/index.php/projects/javascript.html">Roo Javascript library</a> (the ExtJS fork), and are built on top of the Pman components (originally a project management tool that grew into a whole kit of parts). One of the key changes in the way the code is written is how little code is now needed to get the information from the database to the end user.</div><div><br></div><div>Obviously the whole HTML templating has been thrown out the window (other than the first primary HTML page); the whole user interface is built with Javascript, and generated by user interface builder tools. The interaction of the interface is handled by signals (listeners) on the Roo Javascript components. These in turn call the back end (PHP code) and almost always retrieve JSON encoded data, which is rendered using the UI toolkit.</div><div><br></div><div>When I first started moving to this development model, I tended to retain the previous idea of having multiple controllers to handle the Select/Create/Update/Delete actions. As time went on, rather than have multiple controllers for each of those actions, I would use a single controller to manage a single Model entity (like Product). 
POST would always update/modify the model, and GET would always just view and query the data.</div><div><br></div><div>Eventually I realized that since all these controllers were essentially doing the same thing, a single generic controller should be able to do everything that all these single controllers were doing. And so was born the <a href="http://www.roojs.org/mtrack/index.php/File/default/Pman.Base/Pman/Roo.php">Pman_Roo</a> class.</div><div><br></div><div>So basically {index.php}/Roo/{TableName} provides generic database access for the whole application. Most code development on the PHP side is now contained within the DataObject Models. This greatly enhances code reuse, as similar code ends up closer together. Unlike before, where shared code was moved from the controllers to the model when necessary, now most of the code starts off in the model. This speeds project development up considerably, not to mention the huge savings of not having to try and manipulate data into HTML.</div><div><br></div><h3>How does it work.</h3><div><br></div><div>A GET or POST request is received by the server either from Roo's Form/Grid/Tree or directly from Pman.Request(), a handy wrapper around Roo.Ajax that handles error messages nicely.</div><div><br></div><div>The request {index.php}/Roo/{TableName} checks that the table name is valid, then goes on to do the following actions depending on the params supplied. The documentation in the Pman_Roo class is the most up-to-date documentation, and details what calls are made (if available) on the relevant dataobject.</div><div><br></div><div>Using the class, it is now possible to handle pretty much any database related query without implementing any controller, while easily managing data permissions.</div><div><br></div><div>Snapshot of current documentation is in the extended view.. (latest will be in the source)</div> What was I doing last night... 
Seed querying xscreensaver 2011-11-30 00:00:00 http://roojs.com/index.php/View/243/What_was_I_doing_last_night_Seed_querying_xscreensaver.html <div>Like quite a few developers, I earn income by selling my time, either packaged on a project, or by the hour. For this to work, keeping track of time is essential. Unfortunately, like most creative people, I really enjoy hacking, but filling in timesheets just really doesn't do it for me... So read on for my solutions using Gnome Seed...</div><div><br></div> Watch-out PHP 5.3.7+ is about.. and the is_a() / __autoload() mess. 2011-09-02 00:00:00 http://roojs.com/index.php/View/242/Watchout_PHP_537_is_about_and_the_is_a____autoload_mess_.html <div style="font-size: 13px; "><div><div><span style="font-size: medium; ">Well, for the first time in a very long while I had to post to the PHP core developers list last week; unfortunately the result was not particularly useful.</span><div><div><br></div><div>The key issue was that 5.3.7 accidentally broke is_a() for a reasonably large number of users. Unfortunately the fixup release 5.3.8 did not address this 'mistake', and after a rather fruitless exchange I gave up trying to persuade the group (most people on the mailing list) that reverting the change was rather critical (at least Pierre supported reverting it in the 5.3.* series).</div><div><br></div><div>Anyway, what's this all about? Basically if you upgrade to any of these versions and</div><div><br></div><div><b>a) use __autoload()&nbsp;</b></div><div>or</div><div><b>b) any of your code calls is_a() on a string,&nbsp;</b></div><div><br></div><div>you will very likely get strange failures..</div><div><br></div><div><br></div><h3>The change in detail.</h3><div><br></div><div>In all versions of PHP 
since 4.2 the is_a signature looked like this</div><div><br></div><pre>bool is_a ( object $object , string $class_name )</pre><div><br></div><div>As a knock-on effect of fixing a bug with is_subclass_of, somebody thought it was a good idea to make the two functions' signatures consistent, so in 5.3.7+ the signature is now</div><div><br></div><pre>bool is_a ( mixed $object_or_string , string $class_name )</pre><div><br></div><div>And to make matters worse, when the first argument, 'object_or_string', is a string, is_a() will also call the autoloader if the class is not found.</div><div><br></div><div><br></div><h3>How is_a() has been used in the past.</h3><div><br></div><div>On the face of this, it would not seem like a significant change; however, you have to understand the history of is_a(), and why it was introduced. In the early days of PEAR (before PHP 4.2) there was a method called PEAR::isError($mixed), which contained quite a few tests to check if $mixed was an object, and was an instance of 'PEAR_Error'. A while after PHP 4.2 was released, this was changed to use this new wonderful feature, and basically became return is_a($mixed, 'PEAR_Error').</div><div><br></div><div>Since PEAR existed before exceptions (and is still a reasonable pattern to handle errors), it became quite common practice to have returns from methods which looked like this.</div><div><br></div><pre>@return {String|PEAR_Error} $mixed &nbsp;return some data..</pre><div><br></div><div>So the caller would check the return using PEAR::isError(), or quite often just is_a($ret,'PEAR_Error'), if you knew that the PEAR class might not have been loaded.&nbsp;</div><div><br></div><div>So now comes the change and let's see what happens.</div><div><br></div><h3>The __autoload() issue.</h3><div><br></div><div>Personally I never use __autoload, it's the new magic_quotes for me, making code unpredictable and difficult to follow (read the post about require_once is part of your documentation). 
But anyway, each to their own, and for the PEAR packages I support I will usually commit any reasonable change that helps out people who are using autoload.</div><div><br></div><div>So there are users out there using autoload with my PEAR packages, as I quickly found last week. Quite a few of these packages use the is_a() pattern, and the users who had implemented __autoload() had very smartly decided that calling autoload with the name of a class that could not or did not exist was a serious error condition, and they either died, or threw exceptions.</div><div><br></div><div>Unfortunately, since is_a() was sending all of the string data it got straight into __autoload(), this happened rather a lot, leading to a run-around hunt for all calls to is_a(), and code changes being put in to ensure that a string is never passed as the first argument.</div><div><br></div><div><br></div><h3>The is_a(string) issue</h3><div><br></div><div>While I'm not likely to see the autoload issue in my code, I'm not sure I really appreciate having to fix it so quickly, without a timetable for the change. The other change that may cause random, undetectable bugs is the accepting of a string.</div><div><br></div><div>Imagine this bit of code</div><div><br></div><pre>function nextTokString() {&nbsp;<br>&nbsp; &nbsp; if (!is_string($this-&gt;tok[$this-&gt;pos])) {<br>&nbsp; &nbsp; &nbsp; &nbsp; return PEAR::raiseError('....');<br>&nbsp; &nbsp; }<br>&nbsp; &nbsp; return $this-&gt;tok[$this-&gt;pos++];<br>}</pre><div><br></div><div>... some code..</div><pre>$tok = $this-&gt;nextTokString();<br>if (is_a($tok,'PEAR_Error')) {<br>&nbsp; &nbsp; return $tok;<br>}</pre><div>... do stuff with string.</div><div><br></div><div>Now what happens if the token is 'PEAR_Error'? is_a() will now return true. The big issue with this is that unless you know about the is_a() change, this bug is going to be next to impossible to find.. 
No warning is issued; is_a() just silently returns true, where before it just returned false.</div><div><br></div><div>I was hoping that PHP 5.3.9 would go out the door with this reverted, or at least a warning stuck on string usage of is_a(), but nope, none of my efforts at persuasion appear to have worked.</div><div><br></div><div>While I do not think the change is particularly necessary (as the use case for the new signature is very rare, and achievable in other ways), I think reverting this change before PHP 5.3.7+ goes into major deployment is rather critical (yes, it can take months before PHP releases start commonly arriving on servers). Then if it's deemed a necessary change (by vote), go for it in 5.4... and add a warning in the next version in the 5.3 series..</div><div><br></div><h3>Anyway the fixes / workaround:</h3><div><br></div><div>The simplest fix is to prepend tests with is_object</div><div><br></div><div>eg.</div><div><br></div><pre>if (is_a($tok,'PEAR_Error')) {</pre><div><br></div><div>becomes</div><div><br></div><pre><s>if (is_object($tok) &amp;&amp; is_a($tok,'PEAR_Error')) {</s></pre><pre>if ($tok instanceof PEAR_Error) {</pre><div><br></div><div>While you could start looking at the code and determining if you really need to prefix it with is_object(), the reality is that unfortunately it may be simpler to stick this extra code in, just in case you start delivering strings where objects were expected.</div><div><br></div><h3>Update</h3><div><br></div></div></div></div></div><div style="font-size: 13px; ">This has been fixed in 5.3.9; however, part of this derives from some confusion over instanceof</div><div style="font-size: 13px; "><br></div><div style="font-size: 13px; ">When PHP5 was released and added instanceof, doing this when the class did not exist caused a <b>fatal error</b>.</div><div style="font-size: 13px; "><br></div><div style="font-size: 13px; "><pre>if ( $tok instanceof Unknown_Class ) {</pre></div><div style="font-size: 13px; "><br></div><div style="font-size: 13px; ">However, this was changed in 5.1 to not cause a fatal error. The documentation is not totally clear on this, especially for anyone who used PHP 5.0.*.&nbsp;</div><div style="font-size: 13px; "><br></div><div>Unfortunately, since migration times are slow, supporting 5.0-5.1 is a reality of life for anyone writing libraries (actually most of the libraries I write for still provide support for PHP4). So using any 'new' feature of the language basically prevents you from supporting older versions of PHP with new code.</div><div style="font-size: 13px; "><br></div><div style="font-size: 13px; ">In this case, PHP 5.0 usage has slipped below 0.3%, so removing support for this should be fine.&nbsp;</div><div style="font-size: 13px; "><br></div><div style="font-size: 13px; "><br></div> Cli parsing in FlexyFramework, PEAR Console_GetArg 2011-08-11 00:00:00 http://roojs.com/index.php/View/241/Cli_parsing_in_FlexyFramework_PEAR_Console_GetArg.html <font size="2">And another rare article gets published; I've been slacking off posting recently, as I've been busy getting some interesting sites online. The biggest is a rather fun viral advertising campaign on Facebook, <a href="http://www.facebook.com/deargoodboy">www.facebook.com/deargoodboy</a>, which I ended up project managing, after originally only committing to do the Facebook integration.</font><div><font size="2"><br></font></div><div><font size="2">Anyway back to the open source stuff. 
One of the tasks I've had on my todo list for a while is revamping the CLI handling of my framework, which probably has a tiny following, but is one of those incredibly simple, yet powerful backbones to all my projects.</font></div><div><font size="2"><br></font></div><div><font size="2">While this article focuses on the changes to the framework, it should also be of interest to others developing frameworks, and anyone interested in using PEAR's <a href="http://pear.php.net/Console_GetArg">Console_GetArg</a>.</font></div><div style="font-size: 13px; "><br></div> Gtk3 introspection updates and Unusable Unity.. 2011-04-25 00:00:00 http://roojs.com/index.php/View/236/Gtk3_introspection_updates_and_Unusable_Unity.html <div><font size="2">Well, as Gnome 3 is out, it has to be tested. Luckily I've not got a huge deployment to sort out, but as I have a few applications that use Gtk, I thought it was about time I upgraded one of my machines to see what chaos I will have to deal with in the future.</font></div><div><font size="2"><br></font></div><div><font size="2">So it was one of my Ubuntu boxes that got the pleasure of a Natty and <a href="https://launchpad.net/~gnome3-team/+archive/gnome3">Gnome3 PPA</a> upgrade. (I use Debian on my other development box, which actually got destroyed last week with a complete disk failure, although I suspect the motherboard may have problems... It's getting old like me...)</font></div><div><font size="2"><br></font></div><div><font size="2">Upgrading to Natty is not too bad; from what I remember it only took a small amount of brain surgery to get it to boot correctly after the upgrade. But once up, you get the pleasure of the Unity desktop. 
My first impressions of Unity were not too hot. My wife uses it on her netbook, and it's great there: after the initial shock of me upgrading without her knowing, she actually said it was a lot better than compiz, although she missed the special effects.</font></div><div><font size="2"><br></font></div><div><font size="2">But after using Unity on the big screens, it just became unbearable. Detached menus may seem like a cool idea, and are quite handy on a netbook, but they are an absolute nightmare when using things like gimp on dual head full HD monitors; my wrists hurt after a few minutes....&nbsp;</font></div><div><font size="2"><br></font></div><div><font size="2">Then there is the removal of the Applications/Places/System menus, which while clunky are still handy for quickly finding applications. A classic example of this is Alleyoop Memory Checker, a very nice wrapper around valgrind. In the Unity world, if you do not know the name of the application, then finding it is a huge mouse journey around big icons.&nbsp;</font></div><div><font size="2"><br></font></div><div><font size="2">As for the left icon menu, all I can say is that I'm not the world's best designer (although at least I did study it), but it's so graphically noisy that it is unusable. It's basically a bad re-invention of Docky/Cairo Dock, which do a far better job of filling a similar role.&nbsp;</font></div><div><font size="2"><br></font></div><div><font size="2">So after all that I did try and get gnome-shell going, but unfortunately the Gnome3 PPA build is not currently working, and also has a rather nasty habit of removing all usable desktop environments. I ended up adding xterm to one of the /etc/X11/Xsession.d files and starting up gnome-panel, mutter and docky to produce a usable desktop for the time being, while I wait to test out the latest gnome-shell.</font></div><h3><font size="2">So on with the harder stuff.. 
- Gtk3 and introspection.</font></h3><div><font size="2">One of the key applications I use to develop is <a href="http://www.roojs.org/blog.php/app.Builder.js.html">app.Builder.js</a>; it's a drag/drop interface to build web applications, that also allows you to fill in all the code and associate it clearly with the element and event occurring. It's written in Javascript, and uses Gnome Seed to run on the desktop. As I've mentioned before, Seed is a bridge layer between the Webkit Javascript engine and GObject introspection, the now standard way to interface Gnome/Gtk/Glib etc. projects to non-C languages, eg. Python, Javascript (and others...)</font></div><div><font size="2"><br></font></div><div><font size="2">With the introduction of Gtk3, GObject introspection has also been updated, and the updated mix between the two had quite a few knock-on effects on the builder I had written using Gtk2 and pre-0.9 versions of GObject introspection. Here's a general summary of the changes.</font></div><h3><font size="2">TreeIter and TextIter&nbsp;</font></h3><div><font size="2">The latest version of GObject introspection has a feature called 'caller allocates'. Previously with Seed we had to create an instance of a TreeIter, and the TreeIter argument would be of type 'inout' (eg. the Iter would be sent into the method, and returned out)</font></div><div><font size="2"><br></font></div><div><font size="2">eg. 
</font></div><pre><font size="2">var iter = new Gtk.TreeIter();<br></font><font size="2">model.get_iter_from_string(iter, path);<br></font><font size="2">// iter would now contain the tree iter for that path..</font></pre><div><font size="2"><br></font></div><div><font size="2">In newer versions, the iter is an 'out' value, which means you have to pass in an object for the iter to be added to. eg.</font></div><pre><font size="2">var iret = {};<br></font><font size="2">model.get_iter_from_string(iret, path);<br></font><font size="2">// iret.iter now contains the tree iter.</font></pre><div><font size="2"><br></font></div><h3><font size="2">TreeSelection&nbsp;</font></h3><div><font size="2">Since the get_selected method for a GtkTreeSelection now has 2 out values, the call has changed from</font></div><div><font size="2"><br></font></div><div><font size="2">OLD:</font></div><pre><font size="2">var iter = new Gtk.TreeIter();<br></font><font size="2">selection.get_selected(model, iter);</font></pre><div><font size="2"><br></font></div><div><font size="2">NEW:</font></div><pre><font size="2">var sret = {};<br></font><font size="2">selection.get_selected(sret);<br></font><font size="2">// sret now contains { model: **THE MODEL**, iter: **THE ITER** }</font></pre><div><font size="2"><br></font></div><h3><font size="2">TreeModel get_value</font></h3><div><font size="2"><br></font></div><div><font size="2">Since get_value does not have a return value, Seed will return the 'out' values as the return object.</font></div><div><font size="2"><br></font></div><div><font size="2">OLD:</font></div><pre><font size="2">var value = new GObject.Value('');<br></font><font size="2">model.get_value(iter, 2, value);<br></font><font 
size="2">print(value.value);</font></pre><div><font size="2"><br></font></div><div><font size="2">NEW:</font></div><pre><font size="2">var str = model.get_value(iter, 2).value.get_string();<br></font><font size="2">print(str);</font></pre><div><font size="2"><br></font></div><h3><font size="2">Drag pixmaps become surfaces</font></h3><div><span style="font-size: small; ">This is a pure Gtk3 API change (BC break)</span></div><div><font size="2"><br></font></div><div><font size="2">OLD:</font></div><pre><font size="2">var pix = widget.create_row_drag_icon ( path);<br></font><font size="2">Gtk.drag_set_icon_pixmap (ctx, pix.get_colormap(), &nbsp; pix, &nbsp;null, ..... )</font></pre><div><font size="2"><br></font></div><div><font size="2">NEW:</font></div><pre><font size="2">var pix = widget.create_row_drag_icon ( path);<br></font><font size="2">Gtk.drag_set_icon_surface(ctx, pix);</font></pre><h3><font size="2">Drag drop data passing..</font></h3><div><font size="2"><br></font></div><div><font size="2">The drag drop signals appear to work OK; however, I've not managed to get the data to go back and forth. A quick workaround is to just use some form of global variable to store the current dragged item (I doubt you will get more than one dragged item at once..)</font></div><div><font size="2"><br></font></div><h3><font size="2">Drag drop API</font></h3><div><span style="font-size: small; ">A lot of these appear to have played musical chairs.</span></div><pre><font size="2">GtkWidget.prototype.drag_source_set -&gt; Gtk.drag_source_set</font><font size="2"><br></font><font size="2">Gtk.drag_source_set_target_list -&gt; GtkWidget.prototype.drag_source_set_target_list<br></font><font size="2">Gtk.drag_dest_set -&gt; GtkWidget.prototype.drag_dest_set</font></pre><div><font size="2"><br></font></div><h3><font size="2">Internal Seed changes</font></h3><div><font size="2"><br></font></div><div><font size="2">I've added a few more fixes to Seed 
in the last few weeks, mostly to handle compiling correctly and detecting the correct version of introspection. For the most part it's working fine; however, I'm still a bit baffled by a Glib memory corruption bug, which occurred after multiple model.set_value and model.get_value calls. After running valgrind, I managed to stop the corruption occurring by increasing the allocated size for a struct by 1 byte.</font></div><div><font size="2">Around lines 546 and 640 of seed-engine.c the change goes something like this.</font></div><pre><font size="2">- &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp;out_args[n_out_args].v_pointer = g_malloc0 (size);<br></font><font size="2">+ &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp;out_args[n_out_args].v_pointer = g_malloc0 (size + 1);</font></pre><div><font size="2"><br></font></div><div><font size="2">Around line 738 of seed-structs.c</font></div><pre><font size="2">- &nbsp;object = g_slice_alloc0 (size);<br></font><font size="2">+ &nbsp;object = g_slice_alloc0 (size + 1);</font></pre><div><font size="2"><br></font></div><div style="font-size: 13px; "><br></div> How to spam in PHP.. 2011-04-10 00:00:00 http://roojs.com/index.php/View/233/How_to_spam_in_PHP.html <span style="font-size: small; ">Well, after having written a <a href="http://www.mailfort.com">huge anti-spam system</a>, it is now time to solve the reverse problem, sending out huge amounts of email. 
Only kidding, but the idea of scaling email sending using PHP is quite interesting.</span><div><font size="2"><br></font></div><div><font size="2">The reason this has been relevant in the last two weeks is twofold. First off, my slow and sometimes painful <a href="http://roojs.com/mtrack/">rewrite of mtrack</a> has got to the point of looking at email distribution. Along with this I have a project that needs to distribute press releases, and track responses. Since both projects now use the same underlying component framework (<a href="http://www.roojs.com/mtrack/index.php/Browse/default/Pman.Core">Pman.Core</a> and <a href="http://www.roojs.com/mtrack/index.php/Browse/default/Pman.Base">Pman.Base</a>), it seemed like an ideal time to write some generic code that can solve both issues.</font></div><div><font size="2"><br></font></div><h3><font size="2">Classic mailing, you press, we send...</font></h3><div><font size="2">I've forgotten how many times I've written code that sends out email; pretty much all of it tends to be of the variety where the user of the web application presses a button, then the backend code generates one or many emails, and sends them out, most frequently using SMTP to the localhost mailserver.</font></div><div><font size="2"><br></font></div><div><font size="2">In most cases this works fine. You might run into trouble if your local mailserver is down or busy, but for the most part it's a very reliable way to send out fewer than 10 emails in one go.</font></div><div><font size="2"><br></font></div><h3><font size="2">Queues and bulk sending</font></h3><div><font size="2">One of my associates makes a tiny amount of money by offering the service of sending out newsletters and news about bars and restaurants. To do this he bought a commercial PHP package, which I occasionally have the annoying task of debugging and maintaining. What is interesting about this package is the methods it uses to send out email. 
Basically once you have prepared a mailout, and selected who it goes to, it creates records in a table that goes something like this:</font></div><pre><font size="2">User X &nbsp;| Mailout Y<br></font><font size="2">123 &nbsp; &nbsp; | 34<br></font><font size="2">124 &nbsp; &nbsp; | 34<br></font><font size="2">...</font></pre><div><font size="2">There are two methods to then send these mailouts. The first is via the web interface, which uses a bit of ajax refresh to keep loading the page and send out a number of emails in one go (eg. 10 at a time). The second is the cron version, which periodically runs and tries to send out all the mails in that table.</font></div><div><font size="2"><br></font></div><div><font size="2">This method always sends to the localhost mailserver, and lets that sort out the bounces, queuing, retry etc. It has a tendency to be very slow, and to use up a huge amount of memory when sending out huge volumes of email. Most of it gets stuck in the mailserver queue, and the spammer has no real idea if the end users have received it. If the mailserver gets stuck or blocked, the messages can often sit in the queue until they expire 2 days later, by which time the event at the bar may have already occurred.</font></div><div><font size="2"><br></font></div><h3><font size="2">The MTrack way</font></h3><div><font size="2">I'm not sure if I mentioned it before, but I was intrigued by the method used by mtrack when I first saw it. For those unaware of what <a href="http://bitbucket.org/wez/mtrack/">mtrack</a> is, it's an issue tracker based on trac. One of its jobs is to send out emails to anyone 'watching' a project/bug etc.&nbsp;</font></div><div><font size="2"><br></font></div><div><font size="2">From my understanding of what mtrack was doing (the original code has long been forgotten and removed now), it set up a 'watch' list, eg. 
Brian is watching Project X, and Fred is watching Issue 12.&nbsp;</font></div><div><font size="2"><br></font></div><div><font size="2">When Issue 12 changed, or someone committed something to Project X, no actual email was sent at that moment. This obviously removed a failure point on the commit or bug update, and if you had 100s of people watching an issue (like on launchpad, for example), this would prevent the server hanging while it was busy sending all the emails.</font></div><div><font size="2"><br></font></div><div><font size="2">The unfortunate downside was that to make the notifications work a cron job was required. This cron job had to hunt and find all the changes that had occurred and cross reference them with all the people who may have been watching those issues. The code for this was mindblowingly complex, and I suspect was a port of the original trac code.</font></div><div><font size="2"><br></font></div><div><font size="2">As somebody who basically looks at really complex conditional code and wonders 'is that really the best way to do this', I really had to come up with an alternative.</font></div><div><font size="2"><br></font></div><h3><font size="2">Bulk mailing done right....</font></h3><div><font size="2">So to solve my issues with mtrack and the other project, I devised a system that was both simple and fast at the same time. Here's the lowdown.</font></div><div><font size="2"><br></font></div><div><font size="2">First off, both the Mtrack and mailout systems generate the distribution list when the web application user pushes the button. So for Mtrack, somebody updates the ticket (adding a comment for example). 
And the controller for the ticket page basically does a few things:</font></div><div><font size="2"><br></font></div><div><font size="2"><a href="http://www.roojs.com/mtrack/index.php/File/default/web.mtrack/MTrackWeb/Ticket.php?jump=#l215">a) </a>If you have modified the ticket owner (or developer), make sure they are on the 'watch list' of subscribers.&nbsp;</font></div><div><font size="2"><a href="http://www.roojs.com/mtrack/index.php/File/default/web.mtrack/MTrackWeb/Ticket.php?jump=#l241">b)</a> Ask the watch list (the dataobject code for <a href="http://www.roojs.com/mtrack/index.php/File/default/Pman.Core/DataObjects/Core_watch.php?jump=">core_watch</a>) to generate a list of people to notify (in our <a href="http://www.roojs.com/mtrack/index.php/File/default/Pman.Core/DataObjects/Core_notify.php?jump=">core_notify</a> table), and make sure we do not send an email to the person filling in the form (as he knows he just did that and does not need to be reminded).</font></div><div><font size="2"><br></font></div><div><font size="2">The other mailout system also just generates records in the core_notify table. Actually, since the database table for the distribution targets is different in that application, we have a separate table called XXXX_notify, and using the joys of <a href="http://pear.php.net/DB_DataObject">DB_DataObject</a> and object orientation, that class just extends the core_notify class. From what I remember, the only bit of code in that class is var $__table = 'XXXX_notify', since the links.ini handles the reference table data.</font></div><div><font size="2"><br></font></div><div><font size="2">And now for the really cool part: sending the mails out. Obviously this is done via cron jobs (so as not to disrupt the user interface). The backend consists of two parts (pretty much how a mailserver works). The first is the <a href="http://www.roojs.com/mtrack/index.php/File/default/Pman.Core/Notify.php?jump=">queue runner</a>. 
This basically runs through the notify table and makes a big list of what to send out. It uses the <a href="http://www.roojs.com/mtrack/index.php/File/default/pear/HTML/FlexyFramework.php?jump=#l944">ensureSingle</a>() feature of <a href="http://www.roojs.com/mtrack/index.php/File/default/pear/HTML/FlexyFramework.php">HTML_FlexyFramework</a> to ensure only one instance of the queue runner can be running at once.</font></div><div><font size="2"><br></font></div><div><font size="2">Then, rather than sending each email sequentially, it basically proc_open's a new PHP process to send each email. This enables the queue to send many emails concurrently, rather than relying on a single pipeline. The code monitors these subprocesses, and ensures that only a fixed number are running at the same time. We do not want to look too much like a spammer to our ISP.</font></div><div><font size="2"><br></font></div><div><font size="2">The second part, the <a href="http://www.roojs.com/mtrack/index.php/File/default/Pman.Core/NotifySend.php?jump=">code that sends out the single email</a>, can then use MX resolution, send directly to the real email server, and log the result (success, failure or try later).</font></div><div><font size="2"><br></font></div><div><font size="2">Now to actually test all this...</font></div><div><font size="2"><br></font></div>
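<div><font size="2">To make the queue runner idea above a bit more concrete, here is a minimal sketch of a bounded pool of proc_open'd workers. It is not the actual Pman.Core/Notify.php code: the function name runQueue(), the $commandFor callback and the 50ms polling interval are all made up for illustration, it assumes PHP 7.4 or later (array commands for proc_open), and it leaves out the ensureSingle() locking and the database side entirely.</font></div>

```php
<?php
/**
 * Hypothetical sketch: run one child process per notify id, keeping at most
 * $maxWorkers alive at once. Returns an array of id => exit code
 * (-1 if the child could not be started).
 *
 * $commandFor is an assumed callback that maps an id to the command
 * (as an array) used to send that single email.
 */
function runQueue(array $ids, int $maxWorkers, callable $commandFor): array
{
    $running = [];   // id => process resource of children still working
    $results = [];   // id => exit code of finished children

    while ($ids || $running) {
        // Top up the pool until the concurrency limit is reached.
        while ($ids && $maxWorkers > count($running)) {
            $id = array_shift($ids);
            $pipes = [];
            // Empty descriptor spec: the child inherits our stdin/stdout/stderr.
            $proc = proc_open($commandFor($id), [], $pipes);
            if ($proc === false) {
                $results[$id] = -1;
                continue;
            }
            $running[$id] = $proc;
        }

        // Reap any children that have finished since the last pass.
        foreach ($running as $id => $proc) {
            $status = proc_get_status($proc);
            if (!$status['running']) {
                // The exit code is read on the first status call that
                // reports the child as no longer running.
                $results[$id] = $status['exitcode'];
                proc_close($proc);
                unset($running[$id]);
            }
        }

        usleep(50000); // short pause so the parent does not spin
    }

    return $results;
}
```

<div><font size="2">In a real runner the callback would return something like the PHP binary plus the path to the single-email sender and the id of the notify row, and the exit code (or a status column written by the child) would tell the queue whether the send was a success, a failure or a try-later. The fixed $maxWorkers limit is what keeps the number of simultaneous outgoing connections under control.</font></div>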