.todo

                          ~~~~~~~~ TODO  ~~~~~~~~

* If an agent hasn't started after 20 seconds (ie there are no keepalives)
  set the state to dead.

* Have an top like tool. Send the agent state to a queue so that anything
  can attach to the queue and do with it what it will. If this is done on a
  topic queue with fanout then a tool would be able to get the state of every
  agent in the network. Obviously if nothing is listening on the queue then no
  messages would be sent.

* Agents should be namespace-able. The agency and smithctl should understand
  this. Agents should be controllable on a namespace basis, e.g. shutdown all
  agents in a particular namespace. Consider using vhosts.

* Allow an agent to verify that it's the same version as that on disc. Maybe
  implement an autoload system. Look at shogun(?)

* Think about whether to not send messages to agents if they aren't alive.
  At the moment this is a little inconsistent: some commands check to see it the
  agent is alive and others check to see if there is anything listening on the
  queue.

* Have an option that shuts down all agents when the agency is closed. Maybe a
  -e option to stop agency

* Have a config item where I can put a list of agents that I can then start
  (or stop or kill for that matter) all at the same time. For example the command
  might be:

    smithctl group start <group name> smithctl group stop <group name>

  See "Agents should be namespace-able"

* The list command should check to see if the agent is actually alive and
  add that information to the output. [done - check]

* Have a flag for pop-queue that doesn't ack. [done]

* Don't send messages to queues that don't exist [send, smithctl messages,
  pop-queue.

* list command should display the state of the agent. [partially done]

* Think about the idea of a run list. Or is this already implemented in
  the state machine.

* Have a whitelist/blacklist mechanism listing agents that an agent can
  receive messages from.

* Have an errback if publish_and_receive times out. [done - check]

* Add a message retry mechanism. [done]

* Name reply queues based on the agent name. This would only work with the
  default queue.

* Delete queue should be part of smithctl.

* smithctl message should have some sort of printf formatting string
  that can be used to construct the protocol buffer message.

* Change all log messages to use the block form instead of the argument passing.

* Have an option to not ack the pop-queue command. See Receiver#pop for a more
  detailed description. [done]

* Have a SMITH_LOAD_PATH environment variable that overrides the path set up in
  the config

* Have a path where encoder files can be loaded from. [done]

* Have something monitor the agents' state machine for signs of problems.
  For example if an agent is in the starting state for longer than a minutes or
  so

* Be able to change log level for individual loggers (ie per class). [done]

* Add to_s to the Encoders, so when I call inspect on the message it
  gives a proper representation.

* Setup a dead letter queue in case of error. Have a look at the dead letter
  stuff in rabbitmq.

* The number of threads should be configurable programatically
  queues per process.

* Have message counters for every agent. [partially done]

* Set default config for all options that make sense (which is most of them).

* Add checking to the config. For example the logging class can only accept File
  or Stdout appenders but this is not checked for.

* Think about a generator to write the directory structure and what that
  directory structure might look like. Change Smith.root to the root of directory
  structure. You will need to think about what to do with the agent_path (ie the
  fact that multiple paths are allowed).

* I've added a method called add_agent_load_path to the bootstrapper. When
  thinking about the above make sure you check whether that method fits.

* Think about how to make ./bin/send a general tool. Maybe write some rake
  tasks.

* Make an equivalent of ripl-rails: ripl-smith

* Add command to smithctl to return the queue names an agent is using.

* Add the ability to clone an agent. See next point (It looks as if the agent
  lifecycles ...).

* It looks as if the agent lifecycles are keyed on name (see
  Agency#acknowledge_start & Agency#acknowledge_stop). This should be the pid
  or maybe both but just using name I can't have more than one agent which
  stops things like cloning.

* Add code to verify the protocol buffer payload type. It should go in
  Sender#_publish. [done - needs checking]

* Message headers:
  * sent time,
  * message checksum.

* Add the message_id to all log messages.

* Add i18n. It will also make error messages more consistent.

* Make sure application specific queues don't have the smith namespace.

* Allow the agent to be configured so auto_ack is off. auto_ack is nice when
  you are starting out but in high throughput applications it's unlikely to
  be the right thing to do so the auto_ack => false in the queue declaration
  is noisy.

* Add command line options to the agency:
  * dump the config.
  * daemonise the agency (and create the pid file.)
  * specify the pid file.
  * specify the environment
  * specify a different config file.
  * start the agency with a clean db
  * dump a specific queues' acl.

  In fact look at smith1 agency to see what options it had.

* Write a startup script.

* Have an option to freeze an agent. Not sure of a use case but it could be
  quite useful. It might be quite complicated to implement though.

* Clean up the pb cache handling. Have something like:

    Smith.set_acl_path

  oar maybe add it to the Smith run, Smith is useless without it.

* Have a pb cache reload mechanism at the moment you need to restart the agency.

* Think about passing a class to the receiver. This might be a nice way of
  structuring the message handlers.

* Add support for connecting to rabbitmq using ssl with certificates, see
    http://hg.rabbitmq.com/rabbitmq-auth-mechanism-ssl/file/default/README

* Fix incorrect message handling (if an acl gets sent to an agent that can't
  handle it). At the moment it doesn't catch the exception and the agent dies. It
  should probably send it a dead letter queue. Either way at the moment the
  message has to be manually removed from the queue.

  It's actually a bit tricky to know what to do here: should I assume the agent
  is correct and that the message is in error in which as the message should be
  sent to the dead letter queue; or should I assume the message is correct and
  the agent has a bug in which case the current behaviour is correct.

* Add warning that if there is an error of type:

    Channel level exception: PRECONDITION_FAILED - unknown delivery tag 2. Class
    id: 60, Method id: 80, Status code : 406

  it probably means that you've already acked the message.

* Sort the output of the agents command into alphabetical order. [done]

* Clean up some of the logging messages, for example:

  Messaging::Sender prints:
    Publishing to: agency.control. [message]: [agency_command] -> {:command=>"list"}

  where Receiver::Reply prints:
    Payload content: [queue]: a1bf3328e2f3db87 [message]: [string] -> No agents running
    
  this is inconsistent.

* Remove the commas in the agents command. [done]

* If nothing is listening on the reply queue to agency command then don't send
  the message back and delete the queue. I think this is going to be quite hard
  to do. I could provide an option to Reply#reply but that smells.

* Check all the commands for consistency.

* Split the functionality of smithctl. At the moment this simply sends any
  commands to the agency but some commands are better served by executing the
  command directly. For example removing a queue. 

  Removing a queue is quite problematic as I cannot get the queue attributes
  from the broker (not via amqp anyway -- at least I cannot see a way of doing
  it) but I can declare the queue with the :passive option which will use any
  existing queues but will cause a channel exception if the queue doesn't exist
  which makes life really awkward for the agency. If I do this in smithctl then
  a channel exception can be handled properly without impacting the agency.

  So in short have local and remote commands!

* Add next_ticks around things like requeue and general recovery.

* Put a lot more consideration into recovery. Fox example the agency will die if
  there is a queue error.

* Add a class to define the queue name and the type of message they can listen
  on. This can be used to define the queues in a centralised place which should
  avoid configuration errors.

* Separate out the Payload and ACL classes. Or at least check that they should
  be separated out.

  Have an ACL factory that returns decorated ACL objects and have the payload
  use the factory.

  If I set the content in the constructor is that good enough? How about
  instantiating nested objects?

* When an ACL is incomplete add code to check which field is missing. At the
  moment an exception is being raised, the problem is that I've overloaded the
  #inspect and it doesn't display what field is nil.

* Add message versioning.

* Add code to see if the reactor has started. If someone runs Smith.start it
  appears to hang; at least put a warning.

* Check to see if the acl directory actually exists.

* change encoders to be acl.

* Add version command to smith, obviously with support from the agency. [done]

* Add command to list what agents are in each group. Add a --groups option to
  the agents command 

* Add a headers hash to all messages so there are no
  AMQP::IncompatibleOptionsError due to the queue being defined with or without
  the headers hash.

* Add an option to pop that returns immediately. If I remove a lot of messages
  from a queue smithctl will timeout due to the time taken to delete the
  messages. Add an option that returns the number of messages to be removed
  without waiting for the messages to be removed.

* Add default on_requeue proc. Check to see if it makes sense to do so or throw
  and exception if it doesn't.

* Add a message TTL so that messages will get timed out after a period of time.
  This might be particularly useful for smithctl.

* Create a mixin that can be included in the agent to allow a key-value store.
  This would be useful for storing transient data that needs to be persisted.

* Check for the existence of the various paths in the config and log a message
  if it doesn't exist.

* Payload should be able to be instantiated from an undecoded message.

* Add better error reporting. When someone sets the message content that isn't a
  payload object and exception should be raised.

* If the list command is passed an agent just give details about that agent -
  maybe get rid of the stats command.


                          ~~~~~~~~ BUGS  ~~~~~~~~

* Add the internal (to smith) pb directory to the pb load path. [done]

* Smith::ACL::Payload.new(<payload type>, :from => payload) doesn't work for
  :default messages.

* Have a callback per message type. In effect implement pattern matching instead
  of using a case statement.

* Fix bug where I get an exception if the pb class has already been loaded:

  /usr/local/ruby-1.9.3-p0/lib/ruby/gems/1.9.1/gems/ruby_protobuf-0.4.11/lib/protobuf/message/message.rb:43:in `define_field': Field tag 1 has already been used in Smith::ACL::AgencyCommand. (Protobuf::TagCollisionError)  
          from /usr/local/ruby-1.9.3-p0/lib/ruby/gems/1.9.1/gems/ruby_protobuf-0.4.11/lib/protobuf/message/message.rb:23:in `required'
          from /var/cache/smith/pb/agency_command.pb.rb:18:in `<class:AgencyCommand>'
          from /var/cache/smith/pb/agency_command.pb.rb:16:in `<module:ACL>'
          from /var/cache/smith/pb/agency_command.pb.rb:15:in `<module:Smith>'
          from /var/cache/smith/pb/agency_command.pb.rb:14:in `<top (required)>'

  In practice this only happens when a pb is compiled.

* Fix force in protocol_buffer_compiler. There's no way to currently force the
  recompilation the pb files.

* Fix the incredibly slow Payload creation time. It's about 500 times slower
  than instantiating the protocol buffer itself.

* Make sure all time fields in pb files are integers.

* Check queue/exchange leaks. The ones I know of at the moment are:
  * if the publish_and_receive method is used and the message is not replied too
    then there is a exchange/queue leak.
  * I think there is another case but I cannot think of it.

* Start should check to see if the agent exists before starting it.