Performance Design

Some source threads & posts are summarized below.

Partial summary of threads:

General objectives

  • initial loading of a patient's data into the GUI must never take more than 5 seconds; that is a hard limit
  • the vast majority of queries should return in a fraction of a second
  • "make it work now and optimize later" is only adequate as long as we are no more than an order of magnitude away from the desired performance
    • in any other case we have a serious design problem which is unlikely to improve through mere optimization tweaks

Overview

  • There are a number of optimisations we know about: business object caching (we already do that in the clinical record) and preloading the waiting list
  • Ideally, all DB communication should be done by a background thread, so the user can start typing notes etc. while data is still loading. This is what we have asynchronous backend communication for (see the sketch after this list):
    1. a complex / time-consuming query is initiated asynchronously on the server
    2. when ready, the server notifies the client that it can now fetch the data
    3. ideally, we'll have an XML-RPC service or similar on the backend that pre-processes all data on the server exactly the way the client wants it, and then delivers that data
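
A minimal sketch of that pattern, assuming a wxPython client and a DB-API module such as psycopg2; the names fetch_in_background and on_done are hypothetical, not existing GNUmed code:

    import threading

    import psycopg2   # any DB-API 2.0 module would do
    import wx         # GNUmed's GUI toolkit

    def fetch_in_background(dsn, query, args, on_done):
        """Run a query on a worker thread, hand the rows to the GUI thread.

        on_done is invoked via wx.CallAfter, so it runs safely in the
        GUI thread once the data has arrived.
        """
        def worker():
            conn = psycopg2.connect(dsn)
            try:
                curs = conn.cursor()
                curs.execute(query, args)
                rows = curs.fetchall()
            finally:
                conn.close()
            # never touch wx widgets from a worker thread directly
            wx.CallAfter(on_done, rows)

        thread = threading.Thread(target=worker)
        thread.daemon = True    # don't keep the app alive on exit
        thread.start()

The GUI thread returns immediately after starting the worker, so the user can keep typing notes while the rows travel over the wire.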

PostgreSQL strategies

  • emulate non-blocking writes and socket select using threads, as we already wrap the DB-API with gmPG.py
  • aggregate one-by-one value object queries into bulk retrieval
  • for things like get_lab_results() that might return arbitrarily large result lists, tie them to a cursor and do NOT call fetchall() on it, but rather use fetchmany() et al (see the sketch after this list)
    • currently the fetching is hardcoded in GNUmed's run_ro_query(), which is why, in most cases, we do fetchall()
    • we cannot do much about the cursor DECLARE and cursor CLOSE, as that is done by the DB-API module
  • proper indexing/rewriting of slow queries
  • proper database tuning
  • in place of multiple simple queries, use a single more complex query or (especially outside of localhost) some large denormalizing SQL views, to shorten query analyzer response time and cut down on network traffic
  • would pooling "write" connections help?
    • currently GNUmed establishes a separate TCP connection for every write transaction (transactions are committed at connection level, a stupid limitation of the backend libraries). A connection does not allow isolated concurrent transactions, so pooling writeable connections would require disciplined, formal release of the connection after the transaction has completed. Maybe we should have another look at SQL Relay. A pooling sketch follows below.
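
A sketch of the fetchmany() idea from above, assuming psycopg2: giving the cursor a name makes psycopg2 declare a server-side cursor, so only one batch of rows travels over the wire at a time. The table and column names here are hypothetical:

    def stream_lab_results(conn, patient_id, batch_size=200):
        """Yield lab results in batches instead of one huge fetchall()."""
        curs = conn.cursor(name='lab_results_stream')   # server-side cursor
        curs.execute(
            "SELECT * FROM lab_result WHERE fk_patient = %s",
            (patient_id,)
        )
        try:
            while True:
                rows = curs.fetchmany(batch_size)
                if not rows:
                    break
                for row in rows:
                    yield row
        finally:
            curs.close()   # releases the cursor on the server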

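Should connection pooling be revisited, psycopg2's built-in pool illustrates the disciplined release the item above calls for; this is merely a sketch under that assumption, not how gmPG.py currently works:

    from psycopg2 import pool

    # one pool of writable connections instead of one TCP connect per write
    write_pool = pool.ThreadedConnectionPool(1, 5, 'dbname=gnumed')  # hypothetical DSN

    def run_write_query(query, args):
        conn = write_pool.getconn()
        try:
            curs = conn.cursor()
            curs.execute(query, args)
            conn.commit()             # commits still happen per connection
        except Exception:
            conn.rollback()
            raise
        finally:
            write_pool.putconn(conn)  # the formal release after the transaction
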
XML-RPC service

  • benefits:
    • a platform and language independent interface to the backend
    • a means of implementing server side business logic
    • for our Python client it also means two fewer 3rd-party dependencies (pypgsql and mxDateTime)
  • caveats:
    • slow protocol with a lot of overhead
    • requires a lot of good design to make the performance acceptable to the end user
    • much depends on the granularity of the data exchanged (see the sketch below)
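
A minimal sketch of such a service with Python's standard library XML-RPC server; the method name get_patient_overview and its payload are hypothetical, not an existing GNUmed interface:

    from xmlrpc.server import SimpleXMLRPCServer

    def get_patient_overview(patient_id):
        """Pre-process everything server-side, return one client-ready bundle.

        In a real service this would run the complex queries against
        PostgreSQL and shape the result exactly as the client wants it.
        """
        return {
            'patient_id': patient_id,
            'episodes': [],      # filled from the database in real life
            'latest_labs': [],
        }

    server = SimpleXMLRPCServer(('localhost', 9010), allow_none=True)
    server.register_function(get_patient_overview)
    server.serve_forever()

With such a service the client needs a single round trip per screen, e.g. xmlrpc.client.ServerProxy('http://localhost:9010').get_patient_overview(123), which is where the granularity question above comes in.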