Friday, 9 July 2010

Array Arrange By Function

PHP has a fantastically useful range of functions for handling arrays, but a couple of situations I kept running into weren't covered by them. Both involved 2D arrays returned by SQL that I needed to rearrange in different ways. The first function is Array_Arrange_By(), shown below:
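// Re-index a 2D array (e.g. SQL result rows) by the value in column $key.
//   $key2 omitted  => each key maps to the whole row
//   $key2 a string => each key maps to that one column's value
//   $key2 an array => rows sharing a key are grouped into a list
function Array_Arrange_By($arr, $key, $key2 = null) {
      $out = array();
      if (!$arr) {
          return $out;
      }
      foreach ($arr as $row) {
          if (is_array($key2)) {
              // Appending with [] creates the group on first use, so no @ is needed.
              $out[$row[$key]][] = $row;
          } elseif ($key2) {
              $out[$row[$key]] = $row[$key2];
          } else {
              $out[$row[$key]] = $row;
          }
      }
      return $out;
}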
This function will reindex this array in one of 3 ways:
$a = array(array('id' => 1, 'type' => 'dog', 'cst' => 20),
           array('id' => 4, 'type' => 'cat', 'cst' => 60),
           array('id' => 8, 'type' => 'cat', 'cst' => 80),
           array('id' => 9, 'type' => 'cat', 'cst' => 90));

var_export (Array_Arrange_By($a, 'id'));

Gives:

array (
  1 => 
  array (
    'id' => 1,
    'type' => 'dog',
    'cst' => 20,
  ),
  4 => 
  array (
    'id' => 4,
    'type' => 'cat',
    'cst' => 60,
  ),
  8 => 
  array (
    'id' => 8,
    'type' => 'cat',
    'cst' => 80,
  ),
  9 => 
  array (
    'id' => 9,
    'type' => 'cat',
    'cst' => 90,
  ),
)
It's very useful, as the array returned contains the same information but is indexed by the 'id' field. Alternatively, you might only want one of the non-id fields from the original array.
$a = array(array('id' => 1, 'type' => 'dog', 'cst' => 20),
           array('id' => 4, 'type' => 'cat', 'cst' => 60),
           array('id' => 8, 'type' => 'cat', 'cst' => 80),
           array('id' => 9, 'type' => 'cat', 'cst' => 90));

var_export (Array_Arrange_By($a, 'id', 'type'));

Giving:

array (
  1 => 'dog',
  4 => 'cat',
  8 => 'cat',
  9 => 'cat',
)
This makes a handy lookup array. Finally, sometimes you may want to perform the equivalent of a SQL GROUP BY at the PHP level.
$a = array(array('id' => 1, 'type' => 'dog', 'cst' => 20),
           array('id' => 4, 'type' => 'cat', 'cst' => 60),
           array('id' => 8, 'type' => 'cat', 'cst' => 80),
           array('id' => 9, 'type' => 'cat', 'cst' => 90));

var_export (Array_Arrange_By($a, 'type', array()));

Gives:

array (
  'dog' => 
  array (
    0 => 
    array (
      'id' => 1,
      'type' => 'dog',
      'cst' => 20,
    ),
  ),
  'cat' => 
  array (
    0 => 
    array (
      'id' => 4,
      'type' => 'cat',
      'cst' => 60,
    ),
    1 => 
    array (
      'id' => 8,
      'type' => 'cat',
      'cst' => 80,
    ),
    2 => 
    array (
      'id' => 9,
      'type' => 'cat',
      'cst' => 90,
    ),
  ),
)
This array gives a really nice way to generate a two-level table: iterate through the different types, and then through each of the records with that type.
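As a rough sketch of that two-level table, the grouped array can be walked with two nested loops. The render_grouped_table() helper and its HTML layout are hypothetical, made up for illustration. Array_Arrange_By() is repeated so the block runs on its own.

```php
<?php
// Array_Arrange_By() from this post, repeated so the sketch is self-contained.
function Array_Arrange_By($arr, $key, $key2 = null) {
    $out = array();
    if (!$arr) {
        return $out;
    }
    foreach ($arr as $row) {
        if (is_array($key2)) {
            $out[$row[$key]][] = $row;
        } elseif ($key2) {
            $out[$row[$key]] = $row[$key2];
        } else {
            $out[$row[$key]] = $row;
        }
    }
    return $out;
}

// Hypothetical helper: one heading row per group, then one row per record.
function render_grouped_table($grouped) {
    $html = "<table>\n";
    foreach ($grouped as $type => $rows) {
        $html .= '<tr><th colspan="2">' . htmlspecialchars($type) . "</th></tr>\n";
        foreach ($rows as $row) {
            $html .= '<tr><td>' . (int) $row['id'] . '</td><td>' . (int) $row['cst'] . "</td></tr>\n";
        }
    }
    return $html . "</table>\n";
}

$a = array(array('id' => 1, 'type' => 'dog', 'cst' => 20),
           array('id' => 4, 'type' => 'cat', 'cst' => 60),
           array('id' => 8, 'type' => 'cat', 'cst' => 80),
           array('id' => 9, 'type' => 'cat', 'cst' => 90));

echo render_grouped_table(Array_Arrange_By($a, 'type', array()));
```

The outer loop emits a heading for each type and the inner loop emits that type's records, so the table needs no extra bookkeeping to detect where one group ends and the next begins.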

Test Page for Syntax Highlighting

I've been unsatisfied with the results of posting code onto this blog, and I've long been meaning to add syntax highlighting to make it look better. Happily, someone else has done the hard work. Thank you Alex for SyntaxHighlighter.
/**
* SyntaxHighlighter
*/
function foo()
{
  if (counter <= 10)
      return;
  // it works!
}

Friday, 1 May 2009

Kohana Ajax Username Availability

I'm trying out some simple exercises with the Kohana framework. I want to make a website that has a really swish user experience. This exercise creates a registration form that tells new users, as they type, whether the username they are entering is available or taken. I'll be using jQuery, as it gives me the functionality I want with a very minimal amount of code. Here's the Kohana controller code that checks whether a username is taken and returns a simple JSON response.
class Ajax_Controller extends Controller{

  public function checkUserName($username="")
  {
   $user = ORM::factory('user', $username);
   $msg = "<div class='good_feedback'>Available</div>";
   $disable = false;
   if ($user->loaded)
   {
       $msg = "<div class='bad_feedback'>Username Taken</div>";
       $disable = true;
   }
   echo json_encode(array('message' => $msg, 'disable' => $disable));
  }
}
The JavaScript in the HTML attaches a handler function to the keyup event on the username field. Whenever the username changes, an Ajax request is sent, and its response is used to update a feedback div and to enable or disable the submit button.
$(document).ready(function() {
  // Re-check availability each time the username field changes.
  $("#username").keyup(function() {
    // Encode the value so characters such as "/" or "?" don't break the URL.
    jQuery.getJSON('/kohana/index.php/ajax/checkUserName/' + encodeURIComponent($(this).val()),
      function(json) {
        $("#username_feedback").html(json.message);
        $("#submit").attr('disabled', json.disable);
      }
    );
  });
});
And the HTML isn't very complicated:
Username: 
<?php print form::input('username'); ?>
<div id="username_feedback"></div>
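
The response logic in the controller above can be sketched without the framework, which makes it easier to try out on its own. This is only an illustration: the hard-coded $taken list stands in for the ORM lookup, and the function names username_is_taken() and availability_response() are hypothetical, not part of Kohana.

```php
<?php
// Hypothetical stand-in for the ORM lookup: a plain list of taken usernames.
function username_is_taken($username, $taken) {
    return in_array($username, $taken, true);
}

// Builds the same JSON shape the controller echoes: a message and a disable flag.
function availability_response($username, $taken) {
    $isTaken = username_is_taken($username, $taken);
    return json_encode(array(
        'message' => $isTaken
            ? "<div class='bad_feedback'>Username Taken</div>"
            : "<div class='good_feedback'>Available</div>",
        'disable' => $isTaken,
    ));
}

$taken = array('alice', 'bob');
echo availability_response('alice', $taken), "\n"; // taken: disable is true
echo availability_response('carol', $taken), "\n"; // free: disable is false
```

The 'disable' flag is what the keyup handler passes to the submit button, so the browser side needs no logic beyond copying the two fields into the page.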

Sunday, 19 April 2009

New Challenges

Well, having landed myself a new job, I have a new set of challenges being presented to me: some technical and some more directly related to humans. This new job is a permanent position, the first permanent job that I've had in the last 2 years. I wouldn't have moved away from contract positions except for the compelling economic reason of the economy being a bit rooted. I'm settling into the job and finding myself surrounded by the same sort of problem code and problem systems that I've seen around for the last 5 years. This is the first time, though, that I've seen so much broken alongside so many simple, beautiful ways to fix the mess, and yet I can see that the path to a sane code base is going to be long, with potential setbacks along the way. The code currently contains a number of software engineering anti-patterns. Rather than make my own list of the ones I can see, I'll just list all of the programming anti-patterns mentioned in Wikipedia and note whether each is present or not:

Programming anti-patterns

  • Accidental complexity: Introducing unnecessary complexity into a solution (YES)
  • Action at a distance: Unexpected interaction between widely separated parts of a system (NO)
  • Blind faith: Lack of checking of (a) the correctness of a bug fix or (b) the result of a subroutine (YES)
  • Boat anchor: Retaining a part of a system that no longer has any use (YES)
  • Busy spin: Consuming CPU while waiting for something to happen, usually by repeated checking instead of messaging (NO)
  • Caching failure: Forgetting to reset an error flag when an error has been corrected (NO)
  • Cargo cult programming: Using patterns and methods without understanding why (YES)
  • Coding by exception: Adding new code to handle each special case as it is recognized (YES)
  • Error hiding: Catching an error message before it can be shown to the user and either showing nothing or showing a meaningless message (YES)
  • Expection handling: (From Exception + Expect) Using a language's error handling system to implement normal program logic (NO)
  • Hard code: Embedding assumptions about the environment of a system in its implementation (NO)
  • Lava flow: Retaining undesirable (redundant or low-quality) code because removing it is too expensive or has unpredictable consequences (YES)
  • Loop-switch sequence: Encoding a set of sequential steps using a loop over a switch statement (NO)
  • Magic numbers: Including unexplained numbers in algorithms (YES)
  • Magic strings: Including literal strings in code, for comparisons, as event types etc. (YES)
  • Soft code: Storing business logic in configuration files rather than source code (NO)
  • Spaghetti code: Systems whose structure is barely comprehensible, especially because of misuse of code structures (YES oh god YES)
10 out of 17 anti-patterns. They obviously haven't been trying hard enough. Although honestly, with the size of the code base and the amount of code duplication obscuring the business logic, it's entirely possible that they have hidden some of the other problems in there. To be honest, it's the amount of code duplication and the extraordinary lack of libraries for repeated tasks that makes me the saddest. And discussing some of the issues with my fellow programmers makes me realise that the process of education is going to be a long one. I've decided to concentrate on two main things:
  1. Code Reuse is a good thing - Don't repeat what you've done before because you'll never be able to fix it all again later.
  2. Database Abstraction Layers are a good thing in so many ways. Currently the codebase uses 3 different db access classes and has many hundreds of direct references to the database code. Oh and the SQL code is sprinkled liberally through the business logic code.
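As a minimal sketch of the second point, business logic can be written against one small interface instead of against SQL or a particular db access class. Everything named here (UserStore, ArrayUserStore, greeting_for()) is hypothetical and invented for illustration. The in-memory store is only there so the example runs without a database.

```php
<?php
// Hypothetical interface: the only thing business logic knows about storage.
interface UserStore {
    public function findByUsername($username);
}

// In-memory implementation so the sketch runs without a database.
// A real implementation would wrap a single db access class and hold the SQL.
class ArrayUserStore implements UserStore {
    private $rows;

    public function __construct(array $rows) {
        $this->rows = $rows;
    }

    public function findByUsername($username) {
        foreach ($this->rows as $row) {
            if ($row['username'] === $username) {
                return $row;
            }
        }
        return null;
    }
}

// Business logic depends on the interface, not on SQL or a specific db class.
function greeting_for(UserStore $store, $username) {
    $user = $store->findByUsername($username);
    return $user === null ? 'Unknown user' : 'Hello, ' . $user['name'];
}

$store = new ArrayUserStore(array(array('username' => 'kate', 'name' => 'Kate')));
echo greeting_for($store, 'kate'), "\n";
echo greeting_for($store, 'nobody'), "\n";
```

With this shape, swapping the three existing db access classes for one means writing a new UserStore implementation, not hunting through the business logic for SQL.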
I think I'll have to make a conscious effort not to use this blog as a soap box to complain about bad programming but I think it'll come out at least some of the time. Once I find a good way to convince the other programmers of the idiocies of their ways I'll try to explain how I've done it here.

Sunday, 22 March 2009

Delegation and Interfaces 1.

Browsing through programming pages in Wikipedia has led me to the Delegation Pattern, which in typical Wikipedia fashion has been explained quite clearly with examples. Does it answer my question of how to create a clear system of classes with simple logical behaviour, wrapped by a standard presentation layer? I'll sketch out something for this tomorrow.
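
One possible shape for this, as a rough sketch only: a presentation class holds a logic object and delegates the real work to it. The class names PriceCalculator and PriceView and the markup are hypothetical, chosen purely for illustration.

```php
<?php
// Hypothetical logic class: computes a value and knows nothing about presentation.
class PriceCalculator {
    public function total($cst, $quantity) {
        return $cst * $quantity;
    }
}

// Hypothetical presentation wrapper: delegates the calculation, only formats the result.
class PriceView {
    private $calculator;

    public function __construct(PriceCalculator $calculator) {
        $this->calculator = $calculator;
    }

    public function render($cst, $quantity) {
        return '<span class="price">' . $this->calculator->total($cst, $quantity) . '</span>';
    }
}

$view = new PriceView(new PriceCalculator());
echo $view->render(20, 3), "\n"; // prints <span class="price">60</span>
```

The wrapper never inherits from the logic class, so either side can be replaced without touching the other.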

Hierarchical Classes for code reuse

I was reading the other day about programming with classes. The article, by John K. Ousterhout (who is actually talking about scripting languages and programming in the next century), said that class inheritance and multiple levels of classes didn't actually help that much with code reuse, as the classes were often not useful separately and hence the code was bulkier and more difficult to reuse. Which made me think about the codebase I'm starting to work on now: a classic 5 year old PHP codebase programmed by enthusiastic and very smart 'amateurs'. I've not met the people who wrote most of the code, but it has just been lumped together over the years, and as such is a comprehensive jumble of multi-thousand-line PHP scripts, following the one-page-to-one-file paradigm, with only a minimal amount of code reuse. Since I've been hired to work on the code and not to rearchitect it, I'm left with the puzzle of how to build something in the middle of the mess and untangle it as I go.
So are classes the way to go? I think classes are an excellent way of compartmentalising functionality, so accessing a datastore, whether it be a database or a file with a particular format, should be done through a class. Classes seem to be used a lot in this codebase to group together similar functionality, so there are classes full of static functions. This is good, but it raises the question of when some of the functions should be grouped into different classes. And what form should these new classes take? Should they be subclasses of some helpful parent class so I don't repeat code? Or should that parent class not be a parent at all, with its functionality pushed into a class that is used by all the new classes?
Question for today: How to tell if a class should be subclassed into a hierarchy of classes, with different subclasses providing different functionality?
Discussion: I'm not sure, although I think the answer might lie in the Wikipedia article for inheritance, and particularly the quote, 'In most quarters, class inheritance for the sole purpose of code re-use has fallen out of favor. The primary concern is that implementation inheritance does not provide any assurance of polymorphic substitutability - an instance of the re-using class cannot necessarily be substituted for an instance of the inherited class'. So if the parent class and the child class cannot be substituted into the same bit of code to provide different functionality, then it's probably not appropriate to structure the code like that. I think I agree with that.
Conclusion: The parent class should be an ancestor, or perhaps a more primitive instance, of a class and shouldn't exist just to provide a neat library of functions.
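To make the substitutability test concrete, here is a small sketch. The classes Report, SummaryReport and TextHelper are hypothetical and invented for illustration. The child can stand in wherever the parent is expected, while the shared utility function lives in a separate helper class instead of a parent.

```php
<?php
// Hypothetical parent: a more primitive version of the same kind of thing.
class Report {
    protected $rows;

    public function __construct(array $rows) {
        $this->rows = $rows;
    }

    public function title() {
        return 'Report';
    }

    public function lineCount() {
        return count($this->rows);
    }
}

// The child is still a Report, so it can be used wherever a Report is expected.
class SummaryReport extends Report {
    public function title() {
        return 'Summary';
    }

    public function lineCount() {
        return min(1, count($this->rows));
    }
}

// Shared utility code is used by the classes rather than inherited from a parent.
class TextHelper {
    public static function heading($text) {
        return strtoupper($text);
    }
}

// Code written against the parent works unchanged with the child.
function describe(Report $report) {
    return TextHelper::heading($report->title()) . ': ' . $report->lineCount();
}

$rows = array('a', 'b', 'c');
echo describe(new Report($rows)), "\n";        // prints REPORT: 3
echo describe(new SummaryReport($rows)), "\n"; // prints SUMMARY: 1
```

If describe() could not accept the child without special cases, that would be a hint that the relationship is really "uses" rather than "is a", and the shared code belongs in a helper like TextHelper.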