alkali 0.7.0

Alkali, or AlkaliDB, is a database engine. The code can be found on GitHub.

If you know anything about databases then you know that any real database is ACID (Atomic, Consistent, Isolated, Durable). If you know anything about chemistry then you know that an alkaline substance is the opposite of an acidic one.

I think you can see where this is going.

So alkali is basically the opposite of an ACID database. Except Durable. If alkali is not durable then we want to hear about it as soon as possible.

So knowing this, why would you use alkali?

  • If you need a simple, standalone, database-like library
  • If you need a document store
  • Minimal disk io
  • “Advanced” features like foreign keys
  • You’d like a database but must control the on-disk format
  • If a list of dicts is a main data structure in your program
  • You often search/filter your data
  • You like Django and/or SQLAlchemy but don’t need anything that heavy
  • 100% test coverage

Plus alkali is really easy to use. If you’ve ever used the Django ORM or SQLAlchemy then alkali’s API will feel very familiar, since it was modeled on Django’s.

History

When learning new software I find it helpful to know where it’s coming from. Although most software is the worst software ever written, maybe, just maybe, it’s not so bad if you know how it got to where it is.

Jrnl

There’s a great app called jrnl, it’s a journal writing wrapper around your editor. One of jrnl’s killer features is that it writes out plain text files, that means you can edit your journal file with your editor of choice and use all those great Unix command line tools on it.

Jrnl is so great I use it at work and at home. However, it’s really annoying having two journals, it seems like the good info was always in the other one. We have a pretty draconian firewall at work so that means no Dropbox and no ssh. Thankfully, POSTs still work.

Long story short, I wanted jrnl to be able to sync its entries with a website. So I started hacking away and before you know it I was completely rewriting jrnl. But jrnl’s main data structure was a list of Entry objects. This worked, but was a bit cumbersome. It was very cumbersome when trying to sync with a remote server.

So I decided that jrnl really needed to be a wrapper around a database. So I started looking at some different Python databases and a few looked promising, but after playing with them I found them all to be lacking in some fashion or another.

And that’s how alkali was born.

PS. my version of jrnl will hopefully be released not too long after alkali.

Django

I’ve used Django in the past and found its ORM (Object Relational Mapper) to be easy and intuitive. So I decided that I’d write a lightweight database using the same syntax as Django.

This worked surprisingly well. If you’re ever in doubt about alkali, go read the fantastic Django docs and they’ll probably point you in the correct direction. The two relevant sections are about models and queries.

Goals

Since my ultimate goal was a backwards compatible jrnl with webserver syncing, everything in alkali had to support that.

So here is the list of non-negotiable features that alkali had to support:

  • write data files in plain text (be compatible with existing journals)

Yep, that’s basically it. Other features like searching are implied.

Alkali needed to allow a developer to control the on-disk format of its data. And that was easy to accomplish, just inherit from alkali.storage.Storage and override alkali.storage.IStorage.read() and alkali.storage.IStorage.write().
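To make the pattern concrete, here is a minimal standalone sketch of a plain-text storage class. It deliberately does not import alkali: PlainTextStorage, the "title|body" line format, and the choice to have read() yield dicts are all assumptions made for illustration. Only the read(model_class) and write(iterator) method shapes come from the storage module reference.

```python
import io


class PlainTextStorage(object):
    """Hypothetical stand-in for a Storage subclass; does not import alkali."""

    extension = 'txt'

    def __init__(self, stream):
        # a file-like object stands in for the on-disk file
        self._stream = stream

    def read(self, model_class):
        # yield one dict of field values per "title|body" line
        # (model_class is unused in this sketch)
        self._stream.seek(0)
        for line in self._stream:
            line = line.rstrip('\n')
            if not line:
                continue
            title, body = line.split('|', 1)
            yield {'title': title, 'body': body}

    def write(self, iterator):
        # overwrite the stream with one line per element
        self._stream.seek(0)
        self._stream.truncate()
        for elem in iterator:
            self._stream.write('{}|{}\n'.format(elem['title'], elem['body']))


stream = io.StringIO()
storage = PlainTextStorage(stream)
storage.write([
    {'title': 'first', 'body': 'hello'},
    {'title': 'second', 'body': 'a|b'},
])
entries = list(storage.read(model_class=None))
```

Because the format is owned entirely by these two methods, swapping the in-memory stream for a real file or an HTTP call would not change anything else in the program.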

This simple pattern was so effective that I now have a REST storage class and that’s how jrnl now syncs with a webserver.

Quickstart

If you’ve read our History then you know about jrnl.

Let’s use jrnl to show how to use alkali.

Entry

from __future__ import unicode_literals, absolute_import
from alkali import Model, fields, tznow

from .storage import JournalStorage

class Entry(Model):
    class Meta:
        storage = JournalStorage

    date  = fields.DateTimeField(primary_key = True)
    title = fields.StringField()
    body  = fields.StringField()

Let’s break this down a bit.

Model

class Entry(Model):

Inherit from alkali.model.Model to make a new model. A Model class is the equivalent of a database table schema. A Model instance is the equivalent of a row in that table.

Meta

class Meta:
    storage = JournalStorage

A Meta class is optional but it is handy to specify behaviour of the given model. In this case we’re using JournalStorage to save this model.

Known Meta properties are:

  • storage: specify the storage class that reads/writes to persistent storage. This value overrides the database default.
  • filename: specify the actual file to read/write to. If omitted, the filename will default to <model name>.<storage.extension>. The Database can override this of course.

Fields

date  = fields.DateTimeField(primary_key = True)
title = fields.StringField()
body  = fields.StringField()

At the class level, define some variables of type alkali.fields.IField.

Feel free to make your own if the few that come with alkali are insufficient. It would not be hard to make more complicated fields like a BitmapField, all one would have to do is override alkali.fields.IField.dumps() and alkali.fields.IField.loads().

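alkali ships with the following fields:

  • IntField
  • BoolField
  • FloatField
  • StringField
  • DateTimeField
  • SetField
  • ForeignKey
  • OneToOneField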

Fields can take a keyword primary_key. Unlike Django, alkali doesn’t automatically create an id field that is the primary key, you must specify your own. Not only that, you can specify multiple fields as primary_key to create a compound primary key.

Journal

The main parent object in alkali is the alkali.database.Database. You’ll probably want to encapsulate the Database in something; in jrnl’s case that would be the Journal class.

from alkali import Database

class Journal(object):
    def __init__(self, filename=None, save_on_exit=True ):

        # set the filename in Meta so future Storage calls have a
        # file to work with
        Entry.Meta.filename = filename

        self._db = Database( models=[Entry], save_on_exit=save_on_exit )
        self._db.load()

Let’s break this down a bit.

Database

self._db = Database( models=[Entry], save_on_exit=save_on_exit )
self._db.load()

Create a Database object. The only required parameter is models, a list of Model classes that comprise the database.

save_on_exit tells the database to save all its data when it goes out of scope. This means the developer doesn’t have to explicitly call alkali.database.Database.store().

Meta

Entry.Meta.filename = filename

You can set the Model.Meta.filename at definition time or set it later at runtime.

By Our Powers Combined

So let’s make an entry and save it to the database.

from alkali import Database, tznow

db = Database(models=[Entry], save_on_exit=True)

e = Entry(date=tznow(), title="my first entry", body="alkali is pretty good")
e.save()    # adds model instance to Entry.objects

db.store()  # saved to ./Entry.json because those are the defaults

e = Entry.objects.get( title="my first entry" )
e.body = "alkali is the bestest"

# updated entry will be saved when database goes out of scope
# because save_on_exit is True

alkali package

alkali.database module

from alkali import Database, JSONStorage, Model, fields

class MyModel( Model ):
    id = fields.IntField(primary_key=True)
    title = fields.StringField()

db = Database(models=[MyModel], storage=JSONStorage, root_dir='/tmp', save_on_exit=True)

m = MyModel(id=1,title='old title')
m.save()                      # adds model instance to MyModel.objects
db.store()                    # creates /tmp/MyModel.json

db.load()                     # read /tmp/MyModel.json
m = MyModel.objects.get(pk=1) # do a search on primary key
m.title = "my new title"      # change the title
# don't need to call m.save() since the database "knows" about m
# db.store() is automatically called as db goes out of scope, save_on_exit==True
class alkali.database.Database(models=[], **kw)[source]

Bases: object

This is the parent object that owns and coordinates all the different classes and objects defined in this module.

Variables:
  • _storage_type – default storage type for all models, defaults to alkali.storage.JSONStorage
  • _root_dir – directory where all models are stored, defaults to current working directory
  • _save_on_exit – automatically save all models before Database object is destroyed. call Database.store() explicitly if _save_on_exit is false.
Parameters:
  • models – a list of alkali.model.Model classes
  • kw
    • root_dir: default save path directory
    • save_on_exit: save all models to disk on exit
    • storage: default storage class for all models
models

property: return list of models in the database

get_model(model_name)[source]
Parameters:model_name – the name of the model, note: all model names are converted to lowercase
Return type:alkali.model.Model
get_filename(model, storage=None)[source]

get the filename for the specified model. allow models to specify their own filename or generate one based on storage class. prepend Database._root_dir.

eg. <_root_dir>/<model name>.<storage.extension>

Parameters:
  • model – the model name or model class
  • storage – the storage class, uses the database default if None
Returns:

returns a filename path

Return type:

str

set_storage(model, storage=None)[source]

set the storage instance for the specified model

precedence is:

  1. passed in storage class
  2. model defined storage class
  3. default storage class of database (JSONStorage)
Parameters:
  • model – the model name or model class
  • storage (IStorage) – override model storage class
Return type:

alkali.storage.Storage instance or None
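The precedence rule can be sketched in a few lines. This is a standalone illustration, not alkali's code: pick_storage and the two placeholder classes are hypothetical names, and only the order (passed in, then model defined, then database default) is taken from the list above.

```python
class JSONStorage(object):
    """Placeholder class for the sketch, not alkali's JSONStorage."""
    extension = 'json'


class CSVStorage(object):
    """Placeholder class for the sketch, not alkali's CSVStorage."""
    extension = 'csv'


def pick_storage(passed=None, model_storage=None, default=JSONStorage):
    # return the first storage class that is set, in precedence order
    for candidate in (passed, model_storage, default):
        if candidate is not None:
            return candidate
    return None


chosen_default = pick_storage()
chosen_model = pick_storage(model_storage=CSVStorage)
chosen_passed = pick_storage(passed=JSONStorage, model_storage=CSVStorage)
```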

get_storage(model)[source]

get the storage instance for the specified model

Parameters:model – the model name or model class
Return type:alkali.storage.Storage instance or None
store(force=False)[source]

persistently store all model data

Parameters:force (bool) – force store even if alkali.manager.Manager thinks data is clean
load()[source]

load all model data from disk

Parameters:storage (IStorage) – override model storage class

alkali.fields module

class alkali.fields.Field(field_type, **kw)[source]

Bases: object

Base class for all field types. it tries to hold all the functionality so derived classes only need to override methods in special circumstances.

Field objects are instantiated during model creation, e.g. i = IntField()

All Model instances share the same instantiated Field objects in their Meta class. ie: id(MyModel().Meta.fields['i']) == id(MyModel().Meta.fields['i'])

Fields are python descriptors (@property is also a descriptor). So when a field is get/set the actual value is stored in the parent model instance.

The actual Field() object is accessible via model().Meta.fields[field_name] or via dynamic lookup of <field_name>__field. eg. m.email__field.
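If descriptors are new to you, the following standalone sketch shows the general idea: one field object lives on the class, while each value is stored on the instance. It is written for Python 3 and is not alkali's implementation; this IntField takes a name argument and casts with int(), both of which are assumptions made to keep the example short.

```python
class IntField(object):
    """Hypothetical descriptor-based field; not alkali's class."""

    def __init__(self, name):
        self.name = name

    def __get__(self, instance, owner):
        if instance is None:
            return self  # class-level access returns the field itself
        return instance.__dict__.get(self.name)

    def __set__(self, instance, value):
        # cast on assignment, then store the value on the model instance
        instance.__dict__[self.name] = None if value is None else int(value)


class MyModel(object):
    i = IntField('i')


a, b = MyModel(), MyModel()
a.i = "1"   # stored as the int 1 on instance a
b.i = 2     # stored separately on instance b
```

Both instances share the single IntField object defined on MyModel, yet a.i and b.i hold independent values.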

Parameters:
  • field_type (str/int/float/etc) – the type this field should hold
  • kw
    • primary_key: is this field a primary key of parent model
    • indexed: is this field indexed (not implemented yet)
field_type

property: return type of this field (int, str, etc)

properties

property: return list of possible Field properties

default_value

property: what value does this Field default to during model instantiation

cast(value)[source]

Whenever a field value is set, the given value passes through this (or derived class) function. This allows validation plus helpful conversion.

    int_field = "1"            # converts to int("1")
    date_field = "Jan 1 2017"  # converted to datetime()
dumps(value)[source]

called during json serialization, if json module is unable to deal with given Field.field_type, convert to a known type here.

loads(value)[source]

called during json deserialization, if json module is unable to produce the given Field.field_type, convert from the serialized type here.

class alkali.fields.IntField(**kw)[source]

Bases: alkali.fields.Field

default_value

IntField implements auto_increment, useful for a primary_key. The value is incremented during model instantiation.

class alkali.fields.BoolField(**kw)[source]

Bases: alkali.fields.Field

cast(value)[source]

Whenever a field value is set, the given value passes through this (or derived class) function. This allows validation plus helpful conversion.

    int_field = "1"            # converts to int("1")
    date_field = "Jan 1 2017"  # converted to datetime()
class alkali.fields.FloatField(**kw)[source]

Bases: alkali.fields.Field

class alkali.fields.StringField(**kw)[source]

Bases: alkali.fields.Field

holds a unicode string

cast(value)[source]

Whenever a field value is set, the given value passes through this (or derived class) function. This allows validation plus helpful conversion.

    int_field = "1"            # converts to int("1")
    date_field = "Jan 1 2017"  # converted to datetime()
class alkali.fields.DateTimeField(**kw)[source]

Bases: alkali.fields.Field

cast(value)[source]

make sure date always has a time zone

dumps(value)[source]

called during json serialization, if json module is unable to deal with given Field.field_type, convert to a known type here.

loads(value)[source]

called during json deserialization, if json module is unable to produce the given Field.field_type, convert from the serialized type here.
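A datetime is a good example of a type the json module cannot handle by itself. The sketch below shows one plausible dumps()/loads() pair using ISO 8601 strings. It is a standalone Python 3 illustration; the actual string format alkali writes is not stated here, so treat the format as an assumption.

```python
import json
from datetime import datetime, timezone


def dumps(value):
    # json cannot serialize datetime, so convert to an ISO 8601 string
    return value.isoformat()


def loads(value):
    # inverse of dumps: parse the ISO 8601 string back into a datetime
    return datetime.fromisoformat(value)


when = datetime(2017, 1, 1, 12, 30, tzinfo=timezone.utc)
text = json.dumps({'date': dumps(when)})
restored = loads(json.loads(text)['date'])
```

The round trip keeps the time zone, which matters because DateTimeField.cast() makes sure a date always has one.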

class alkali.fields.SetField(**kw)[source]

Bases: alkali.fields.Field

class alkali.fields.ForeignKey(foreign_model, **kw)[source]

Bases: alkali.fields.Field

A ForeignKey is a special type of field. It stores the same value as a primary key in another field. When the model gets/sets a ForeignKey the appropriate lookup is done in the remote manager to return the remote instance.

Parameters:
  • foreign_model (alkali.model.Model) – the Model that this field is referencing
  • kw
    • primary_key: is this field a primary key of parent model
pk_field
Return type:IField.field_type(), eg: IntField
lookup(pk)[source]

given a pk, return foreign_model instance

cast(value)[source]

return the primary_key value of the foreign model

dumps(value)[source]

called during json serialization, if json module is unable to deal with given Field.field_type, convert to a known type here.

class alkali.fields.OneToOneField(foreign_model, **kw)[source]

Bases: alkali.fields.ForeignKey

Parameters:
  • foreign_model (alkali.model.Model) – the Model that this field is referencing
  • kw
    • primary_key: is this field a primary key of parent model

alkali.manager module

class alkali.manager.Manager(model_class)[source]

Bases: object

the Manager class is the parent/owner of all the alkali.model.Model instances. Each Model has its own manager. Manager could rightly be called Table.

Parameters:model_class (Model) – the model that we should store (not an instance)
model_class
count

property: number of model instances we’re holding

pks

property: return all primary keys

Return type:list
instances

property: return all model instances

Return type:list
dirty

property: return True if any model instances are dirty

Return type:bool
static sorter(elements, reverse=False)[source]

yield model instances in primary key order

Parameters:
  • elements (Manager.instances) – our instances
  • kw
    • reverse: return in reverse order
Return type:

generator

save(instance, dirty=True, copy_instance=True)[source]

Copy instance into our collection. We make a copy so that caller can’t change its object and affect our version without calling save() again.

Parameters:
  • instance (Model) –
  • dirty – don’t mark us as dirty if False, used during loading
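The copy-on-save behaviour is easier to see with a toy version. This sketch uses plain dicts as rows and a hypothetical Manager class with only save() and get(); it does not import alkali and ignores the dirty flag entirely.

```python
import copy


class Manager(object):
    """Hypothetical stand-in; keeps copies keyed by primary key."""

    def __init__(self):
        self._instances = {}

    def save(self, instance, copy_instance=True):
        # store a copy so later edits by the caller need another save()
        stored = copy.copy(instance) if copy_instance else instance
        self._instances[instance['id']] = stored

    def get(self, pk):
        return self._instances[pk]


manager = Manager()
row = {'id': 1, 'title': 'old title'}
manager.save(row)

row['title'] = 'new title'               # not visible to the manager yet
title_before = manager.get(1)['title']

manager.save(row)                        # now the manager sees the change
title_after = manager.get(1)['title']
```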
clear()[source]

remove all instances of our models. we’ll be marked as dirty if we previously had model instances.

Note: this does not affect on-disk files until Manager.store() is called.

delete(instance)[source]

remove an instance from our models by calling del on it

Parameters:instance (Model) –
cb_delete_foreign(sender, instance)[source]

called when our foreign parent is about to be deleted

cb_create_foreign(sender, instance)[source]

called when our foreign parent (likely OneToOneField) is created

store(storage, force=False)[source]

save all our instances to storage

Parameters:
  • storage (Storage) – an instance
  • force (bool) – force save even if we’re not dirty
load(storage)[source]

load all our instances from storage

Parameters:storage (Storage) – an instance
Raises:KeyError – if there are duplicate primary keys
get(*pk, **kw)[source]

perform a query that returns a single instance of a model

Parameters:
  • pk (value or tuple if multi-pk) – optional primary key
  • kw – optional field_name=value
Return type:

single alkali.model.Model instance

Raises:
  • DoesNotExist – if 0 instances returned
  • MultipleObjectsReturned – if more than 1 instance returned
m = MyModel.objects.get(1)      # equiv to
m = MyModel.objects.get(pk=1)

m = MyModel.objects.get(some_field='a unique value')
m = MyModel.objects.get(field1='a unique', field2='value')
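The exactly-one-result contract of get() can be sketched over a list of dicts. This is a standalone illustration rather than alkali's code: the module-level get() function and the two exception classes defined here are stand-ins that only borrow the documented names.

```python
class DoesNotExist(Exception):
    """Stand-in for the model specific exception."""


class MultipleObjectsReturned(Exception):
    """Stand-in for the exception raised on ambiguous queries."""


def get(rows, **kw):
    # keep rows whose fields equal every keyword argument
    matches = [r for r in rows if all(r.get(k) == v for k, v in kw.items())]
    if not matches:
        raise DoesNotExist(kw)
    if len(matches) > 1:
        raise MultipleObjectsReturned(kw)
    return matches[0]


rows = [
    {'id': 1, 'kind': 'a'},
    {'id': 2, 'kind': 'b'},
    {'id': 3, 'kind': 'b'},
]
found = get(rows, id=1)
```

Calling get(rows, id=9) raises DoesNotExist, and get(rows, kind='b') raises MultipleObjectsReturned because two rows match.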

alkali.metamodel module

class alkali.metamodel.MetaModel[source]

Bases: type

do not use this class directly

code reviews of this class are very welcome

base class for alkali.model.Model.

this complicated metaclass is required to convert a stylized class into a useful concrete one. it converts alkali.fields.Field variables into their base types as attributes on the instantiated class.

Meta: adds a Meta class if not already defined in Model derived class

objects: alkali.manager.Manager

alkali.model module

exception alkali.model.ObjectDoesNotExist[source]

Bases: Exception

base class for a model specific exception (eg. MyModel.DoesNotExist) raised when a query yields no results

class alkali.model.Model(*args, **kw)[source]

Bases: object

main class for the database.

the definition of this class defines a table schema but instances of this class hold a row.

model fields are available as attributes. eg. m.my_field = 'foo'

the Django docs at https://docs.djangoproject.com/en/1.10/topics/db/models/ will be fairly relevant to alkali

see alkali.database for some example code

set_field(field, value)[source]

set a field value, this method is automatically called when setting a field value. safe to call externally.

fires alkali.signals.field_update for any listeners

Parameters:
  • field (alkali.fields.Field) – instance of Field
  • value (Field.field_type) – the already-cast value to store
dirty

property: return True if our fields have changed since creation

schema

property: a string that quickly shows the fields and types

pk

property: returns this model’s primary key value. If the model is comprised of several primary keys then return a tuple of them.

Return type:Field.field_type or tuple-of-Field.field_type
valid_pk
dict

property: returns a dict of all the fields, the fields are json consumable

Return type:OrderedDict
json

property: returns json that holds all the fields

Return type:str
save()[source]

add ourselves to our alkali.manager.Manager and mark ourselves as no longer dirty.

it’s up to our Manager to persistently save us

alkali.peekorator module

from https://gist.github.com/dmckeone/7518335, slightly modified by Kurt Neufeld

Generic Peekorator, modeled after next(), for “looking into the future” of a generator/iterator.


class alkali.peekorator.PeekoratorDefault[source]

Bases: object

alkali.peekorator.peek(peekorator, n=0, default=<alkali.peekorator.PeekoratorDefault object>)[source]

next()-like function to be used with a Peekorator

Parameters:
  • peekorator – Peekorator to use
  • n (int) – Number of items to look ahead
  • default – If the iterator is exhausted then a default is given, raise StopIteration if not given
class alkali.peekorator.Peekorator(generator)[source]

Bases: object

Wrap a generator (or iterator) and allow the ability to peek at the next element in a lazy fashion. If the user never uses peek(), then the only cost over a regular generator is the proxied function call.

Parameters:generator – a generator or iterator that will be iterated over
peek(n=0)

Return the peeked element for the generator

Parameters:n (int) – how many iterations into the future to peek
next()

Get the next result from the generator

is_first()[source]

if you just got the first element then return True

Return type:bool
is_last()[source]

if you’re about to get the last element then return True

Return type:bool
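For readers who have not seen the pattern, here is a small standalone lookahead wrapper in the same spirit. It is not the Peekorator from the gist: the Peeker name, the deque buffer, and the default=None behaviour of peek() are assumptions of this Python 3 sketch.

```python
from collections import deque


class Peeker(object):
    """Minimal lookahead wrapper; hypothetical, not alkali's Peekorator."""

    def __init__(self, iterable):
        self._it = iter(iterable)
        self._buffer = deque()   # items fetched but not yet handed out
        self._served = 0         # number of items handed out so far

    def _fill(self, n):
        # pull items until the buffer holds n of them or the source is exhausted
        while len(self._buffer) < n:
            try:
                self._buffer.append(next(self._it))
            except StopIteration:
                break

    def peek(self, n=0, default=None):
        # look n items ahead without consuming anything
        self._fill(n + 1)
        return self._buffer[n] if len(self._buffer) > n else default

    def __iter__(self):
        return self

    def __next__(self):
        self._fill(1)
        if not self._buffer:
            raise StopIteration
        self._served += 1
        return self._buffer.popleft()

    def is_first(self):
        # True right after the first item has been handed out
        return self._served == 1

    def is_last(self):
        # True when exactly one item is left to hand out
        self._fill(2)
        return len(self._buffer) == 1


p = Peeker([1, 2, 3])
first = next(p)
upcoming = p.peek()
```

Nothing is read from the underlying iterator until peek(), is_last() or next() actually needs it, which keeps the wrapper lazy.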

alkali.query module

from alkali import Database, Model, fields

class MyModel( Model ):
    id = fields.IntField(primary_key=True)
    title = fields.StringField()

db = Database( models=[MyModel] )

# create 10 instances and save them
for i in range(10):
    MyModel(id=i, title='number %d' % i).save()

assert MyModel.objects.count == 10
assert MyModel.objects.filter(id__gt=5).count == 4
assert MyModel.objects.filter(id__gt=5, id__le=7).count == 2
assert MyModel.objects.get(pk=1).title == 'number 1'
assert MyModel.objects.order_by('id')[0].id == 0
assert MyModel.objects.order_by('-id')[0].id == 9
class alkali.query.Aggregate(field)[source]

Bases: object

A reducing function that returns a single value

Parameters:field (str) – field/property name to aggregate
class alkali.query.Count(field)[source]

Bases: alkali.query.Aggregate

number of objects in query

Parameters:field (str) – field/property name to aggregate
class alkali.query.Sum(field)[source]

Bases: alkali.query.Aggregate

sum of given field (numeric field required)

Parameters:field (str) – field/property name to aggregate
class alkali.query.Max(field)[source]

Bases: alkali.query.Aggregate

largest field (numeric field required)

Parameters:field (str) – field/property name to aggregate
class alkali.query.Min(field)[source]

Bases: alkali.query.Aggregate

smallest field (numeric field required)

Parameters:field (str) – field/property name to aggregate
alkali.query.as_list(func)[source]
class alkali.query.Query(manager)[source]

Bases: object

this class performs queries on manager instances and returns lists of model instances

this class is one of the main reasons to use alkali

the Django docs at https://docs.djangoproject.com/en/1.10/topics/db/queries/ will be fairly relevant to alkali, except for anything related to foreign or many2many fields.

this is an internal class so you shouldn’t have to create it directly. it is created via the Manager, eg. MyModel.objects

Parameters:manager (Manager) –
count

property: number of model instances we are currently tracking

fields

property: helper function to get dict of model fields

Return type:dict
model_class

property: return our manager’s model class

field_names

property: return our model field names

Return type:list of str
all()[source]
filter(**kw)[source]
Parameters:kw – field_name__op=value, note: field_name can be a property
Return type:Query

perform a query, keeping model instances that pass the criteria specified in the kw parameter.

see example code above. see Django page for very thorough docs on this functionality. basically, it’s field_name ‘__’ operation = value.

# field/property f is 'foo' or 'bar'
MyModel.objects.filter( f__in=['foo','bar'] )

# 'foo' is in field/property myset
MyModel.objects.filter( myset__rin='foo' )
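To show how a keyword like id__gt=5 can be turned into a test, here is a standalone sketch over a list of dicts. It is not alkali's Query code: filter_rows and the OPS table are hypothetical, only a handful of operators are included, and treating a bare field name as equality is an assumption borrowed from Django's behaviour.

```python
import operator

# subset of operators; 'rin' is the reversed "value in field" test
OPS = {
    'eq': operator.eq,
    'gt': operator.gt,
    'le': operator.le,
    'in': lambda field, value: field in value,
    'rin': lambda field, value: value in field,
}


def filter_rows(rows, **kw):
    for key, value in kw.items():
        # "id__gt" -> ("id", "gt"); a bare "id" means equality
        name, _, op = key.partition('__')
        test = OPS[op or 'eq']
        rows = [r for r in rows if test(r[name], value)]
    return rows


rows = [{'id': i, 'tags': {'even'} if i % 2 == 0 else {'odd'}}
        for i in range(10)]

big = filter_rows(rows, id__gt=5)
window = filter_rows(rows, id__gt=5, id__le=7)
evens = filter_rows(rows, tags__rin='even')
```

Each keyword narrows the previous result, so multiple criteria combine as a logical AND.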
order_by(*fields)[source]

change order of self.instances

Parameters:fields (str) – field names, prefixed with optional '-' to indicate reverse order
Return type:Query

warning: because this isn’t a real database and we don’t have grouping, passing in multiple fields will very possibly sort on the last field only. python sorting is stable however, so a multiple field sort may work as intended.
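The remark about stable sorting can be made concrete. The standalone sketch below sorts a list of dicts on several fields by sorting once per field, least significant field first. Whether alkali's order_by applies its sorts in this order is not stated, so this shows the general technique rather than the library's behaviour; the order_by function here is hypothetical.

```python
from operator import itemgetter


def order_by(rows, *fields):
    rows = list(rows)
    # sort by the least significant field first; because list.sort is
    # stable, each later pass keeps the order set by the earlier ones
    for field in reversed(fields):
        reverse = field.startswith('-')
        rows.sort(key=itemgetter(field.lstrip('-')), reverse=reverse)
    return rows


rows = [
    {'kind': 'b', 'n': 1},
    {'kind': 'a', 'n': 2},
    {'kind': 'a', 'n': 1},
    {'kind': 'b', 'n': 2},
]
ordered = order_by(rows, 'kind', '-n')   # kind ascending, then n descending
```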

group_by(field)[source]

returns a dict of distinct values and Query objects

Parameters:field – field name
Return type:dict
MyModel.objects.group_by('str_type')

{ 's1': <Query MyModel(1), MyModel(3)>
  's2': <Query MyModel(2)> }
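A rough standalone equivalent over a list of dicts looks like this. The real method returns Query objects as values; this sketch returns plain lists, and the group_by function is a hypothetical name.

```python
from collections import OrderedDict


def group_by(rows, field):
    groups = OrderedDict()
    for row in rows:
        # one bucket per distinct value of the field
        groups.setdefault(row[field], []).append(row)
    return groups


rows = [
    {'id': 1, 'str_type': 's1'},
    {'id': 2, 'str_type': 's2'},
    {'id': 3, 'str_type': 's1'},
]
groups = group_by(rows, 'str_type')
```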
limit(**kw)[source]
first()[source]

return first object from query (depends on ordering); raises if query is empty

values(**kw)[source]
values_list(*fields, **kw)[source]

returns nested list of values in given fields order

if flat=True, returns single list

Parameters:
  • fields (str) – field names or all fields if empty
  • kw (bool) – flat
Return type:

list

see alkali.query.Query.annotate() for example
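The difference between the nested and flat forms is shown in this standalone sketch over a list of dicts. It is a Python 3 illustration with a hypothetical values_list function, and unlike the documented method it requires at least one field name.

```python
def values_list(rows, *fields, flat=False):
    # one inner list per row, values in the order the fields were given
    nested = [[row[f] for f in fields] for row in rows]
    if flat:
        # flatten one level into a single list
        return [value for sub in nested for value in sub]
    return nested


rows = [
    {'id': 1, 'title': 'one'},
    {'id': 2, 'title': 'two'},
]
pairs = values_list(rows, 'title', 'id')
ids = values_list(rows, 'id', flat=True)
```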

exists()[source]

does the current query hold any elements

Return type:int
aggregate(*args, **kw)[source]

Aggregate (aka reduce) the query via the given function. Each callable Aggregate object takes a field/property name as a parameter.

The returned dictionary has key <field_name>__<agg function> unless keyword is given.

Parameters:
  • args (Aggregate) – Count Sum Max Min
  • kw – key_value=Aggregate, note: field_name can be a property
Return type:

dict

MyModel.objects.aggregate( Sum('size'), the_count=Count('id') )
# { 'the_count': 12, 'size__sum': 24957 }
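The key naming rule is the interesting part, so here is a standalone sketch of it over a list of dicts. None of this is alkali's code: the classes reuse the documented names for readability, while the name attribute, the reduce hook, and the module-level aggregate() function are assumptions.

```python
class Aggregate(object):
    """Hypothetical reducer base; subclasses set name and reduce."""
    name = None
    reduce = None

    def __init__(self, field):
        self.field = field

    def __call__(self, rows):
        # collect the field from every row, then reduce to a single value
        return self.reduce([r[self.field] for r in rows])


class Count(Aggregate):
    name = 'count'
    reduce = staticmethod(len)


class Sum(Aggregate):
    name = 'sum'
    reduce = staticmethod(sum)


class Max(Aggregate):
    name = 'max'
    reduce = staticmethod(max)


class Min(Aggregate):
    name = 'min'
    reduce = staticmethod(min)


def aggregate(rows, *args, **kw):
    result = {}
    for agg in args:
        # positional aggregates get the key <field_name>__<agg function>
        result['{}__{}'.format(agg.field, agg.name)] = agg(rows)
    for key, agg in kw.items():
        # keyword aggregates use the caller's key
        result[key] = agg(rows)
    return result


rows = [{'id': 1, 'size': 10}, {'id': 2, 'size': 32}]
totals = aggregate(rows, Sum('size'), the_count=Count('id'))
```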
annotate(**kw)[source]

add a variable to each model instance currently in the query

each element in query is passed into function.

Parameters:kw – variable=function
Return type:Query
import itertools
c = itertools.count()
Counter = lambda elem, c=c: next(c)
MyModel(int_type=10).save()
MyModel(int_type=11).save()
MyModel.objects.annotate(counter=Counter).values_list('int_type','counter')
# [[10, 1], [11, 2]]
MyModel.objects.annotate(counter=Counter).values_list('int_type','counter')
# [[10, 3], [11, 4]]
distinct(*fields)[source]

returns a list of lists, each sub-list contains distinct values of the given field

Parameters:fields (str) – field names
Return type:list
MyModel(int_type=10, str_type='hi').save()
MyModel(int_type=11, str_type='hi').save()
MyModel(int_type=12, str_type='there').save()

MyModel.objects.distinct('int_type','str_type')
# [[10, 11, 12], [u'there', u'hi']]

MyModel.objects.distinct('str_type')
# [[u'there', u'hi']]
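The shape of the result (one sub-list per requested field) can be reproduced with a short standalone sketch. The distinct function here is hypothetical and keeps values in first-seen order, which differs from the ordering shown in the sample output above.

```python
def distinct(rows, *fields):
    result = []
    for field in fields:
        seen = []
        for row in rows:
            # keep each value once, in the order it first appears
            if row[field] not in seen:
                seen.append(row[field])
        result.append(seen)
    return result


rows = [
    {'int_type': 10, 'str_type': 'hi'},
    {'int_type': 11, 'str_type': 'hi'},
    {'int_type': 12, 'str_type': 'there'},
]
both = distinct(rows, 'int_type', 'str_type')
```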

alkali.relmanager module

class alkali.relmanager.RelManager(foreign, child_class, child_field)[source]

Bases: object

This is an internal class that a user of alkali is unlikely to use directly.

The RelManager class manages queries/connections between two models that have an alkali.fields.ForeignKey (or equivalent) field.

Parameters:
  • foreign (Model) – instance of the model that is pointed at
  • child_class (Model) – the model class that contains the ForeignKey
  • child_field (str) – the field name that points to ForeignKey
foreign
child_class
child_field
count
add(child)[source]
create(**kw)[source]
all()[source]

get all objects that point to this instance. see alkali.manager.Manager.all() for syntax

Return type:alkali.query.Query
get(**kw)[source]

get a single object that refers to this instance. see alkali.manager.Manager.get() for syntax

Return type:alkali.model.Model

alkali.signals module

alkali.storage module

exception alkali.storage.FileAlreadyLocked[source]

Bases: Exception

the exception that is thrown when a storage instance tries and fails to lock its data file

class alkali.storage.Storage(*args, **kw)[source]

Bases: object

helper base class for the Storage object hierarchy

class alkali.storage.FileStorage(filename=None, *args, **kw)[source]

Bases: alkali.storage.Storage

this helper class determines the on-disk representation of the database. it could write out objects as json or plain txt or binary, that’s up to the implementation and should be transparent to any models/database.

extension = 'raw'
filename
lock()[source]
unlock()[source]
read(model_class)[source]

helper function that just reads a file

write(iterator)[source]
class alkali.storage.JSONStorage(filename=None, *args, **kw)[source]

Bases: alkali.storage.FileStorage

save models in json format

extension = 'json'
read(model_class)[source]

helper function that just reads a file

write(iterator)[source]
class alkali.storage.CSVStorage(filename=None, *args, **kw)[source]

Bases: alkali.storage.FileStorage

load models in csv format

first line assumed to be column headers (aka: field names)

use remap_fieldnames to change column headers into model field names

extension = 'csv'
read(model_class)[source]

helper function that just reads a file

remap_fieldnames(model_class, row)[source]

example of remap_fieldnames that could be defined in derived class or as a stand-alone function.

warning: make sure your header row that contains field names has no spaces in it

def remap_fieldnames(self, model_class, row):
    fields = model_class.Meta.fields.keys()

    for k in list(row.keys()):  # copy the keys, row is modified inside the loop
        results_key = k.lower().replace(' ', '_')

        if results_key not in fields:
            if k == 'Some Weird Name':
                results_key = 'good_name'
            else:
                raise RuntimeError( "unknown field: {}".format(k) )

        row[results_key] = row.pop(k)

    return row
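For comparison, here is a standalone variant that can be run without alkali. It reads rows with the standard library's csv.DictReader and builds a new dict per row instead of renaming keys in place. The function signature, the known_fields argument, and the sample data are assumptions of this sketch and differ from the method signature above.

```python
import csv
import io


def remap_fieldnames(row, known_fields):
    # build a new dict instead of mutating row while iterating over it
    remapped = {}
    for key, value in row.items():
        name = key.strip().lower().replace(' ', '_')
        if name not in known_fields:
            raise RuntimeError("unknown field: {}".format(key))
        remapped[name] = value
    return remapped


# the first line holds the column headers, as CSVStorage assumes
data = io.StringIO("ID,Title\n1,first\n2,second\n")
rows = [remap_fieldnames(r, {'id', 'title'}) for r in csv.DictReader(data)]
```

Note that every value arrives as a string; turning "1" into an int would be the job of the field's cast().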
write(iterator)[source]

warning: if remap_fieldnames changes names then saved file will have a different header line than original file
