Recently, I tried to optimize a Python module that reads JSON strings from a Redis server and upserts them into MongoDB. With pymongo the throughput was far too low for my workload, and because of the GIL I could not even improve it with multiple threads. So I decided to try solving the problem in C.
The MongoDB client library for C is mongo-c-driver, so install it first.
In ‘mongo-c-driver’, the main data structure is not JSON; it uses BSON (via libbson) instead. If you want to study libbson through its documentation, you can open it with the ‘yelp’ command (if you are using Linux with a GUI) or convert the Mallard pages to HTML.
With everything ready, I first wrote a simple program, bson.c, to study libbson. Pay attention to the Makefile: the program has to be compiled and linked against ‘mongo-c-driver’.
This is the main program that I will use in my project. I compile it into a ‘.so’ shared library, so its functions can be called from C, Python, or other programming languages.
The key is to treat the ‘mongo_updater’ pointer as a ‘void’ pointer. This way, I can call all the functions from Python without defining any class that mirrors the ‘mongo_updater’ structure. (I really have no idea how I would define a class to match a structure that contains a mongoc_client_t pointer and a mongoc_collection_t pointer.)
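The opaque-pointer idea can be tried without the real library. The following is a minimal sketch, not the code from this project: it assumes a POSIX system and uses libc's `FILE *` as a stand-in for the ‘mongo_updater’ pointer, since both are C structures that Python never needs to look inside. The helper name `write_via_opaque_handle` is hypothetical.

```python
import ctypes
import ctypes.util

# Stand-in for libupdate.so: load the C standard library (POSIX assumed).
_libc_name = ctypes.util.find_library("c")
libc = ctypes.CDLL(_libc_name) if _libc_name else ctypes.CDLL(None)

# fopen() returns a FILE *, but Python only ever sees it as a void *.
libc.fopen.argtypes = [ctypes.c_char_p, ctypes.c_char_p]
libc.fopen.restype = ctypes.c_void_p
libc.fputs.argtypes = [ctypes.c_char_p, ctypes.c_void_p]
libc.fputs.restype = ctypes.c_int
libc.fclose.argtypes = [ctypes.c_void_p]
libc.fclose.restype = ctypes.c_int


def write_via_opaque_handle(path: str, text: str) -> None:
    """Create a handle in C, pass it back to C, then let C destroy it."""
    handle = libc.fopen(path.encode(), b"w")
    if not handle:  # a c_void_p restype maps a NULL pointer to None
        raise OSError(f"fopen failed for {path}")
    try:
        if libc.fputs(text.encode(), handle) < 0:
            raise OSError("fputs failed")
    finally:
        libc.fclose(handle)
```

Declaring `restype` as `c_void_p` matters: without it, ctypes assumes a C `int` return value, which can truncate a 64-bit pointer.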
## test.c and test.py
In ‘test.c’, I had to write a function that implements URL encoding. ‘mongo-c-driver’ connects to MongoDB through a URL that contains the user name, password, host, and auth source. Because it is a URL, special characters in those parts have to be percent-encoded. The program connects to MongoDB, reads some data from ‘test.data’, converts it to a BSON object, and finally upserts it into MongoDB.
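To show why the encoding step is needed, here is a minimal sketch of the same idea in Python rather than C, using the standard library instead of a hand-written encoder. The function name `build_mongo_uri` and the sample credentials are hypothetical, and the URI layout is an assumption based on the usual `mongodb://user:password@host:port/?authSource=...` form.

```python
from urllib.parse import quote


def build_mongo_uri(user: str, password: str, host: str,
                    port: int, auth_source: str) -> str:
    """Build a mongodb:// URI with percent-encoded credentials."""
    # safe="" forces ':', '/', and '@' to be encoded as well, because
    # those characters would otherwise be read as URI delimiters.
    user_enc = quote(user, safe="")
    password_enc = quote(password, safe="")
    auth_enc = quote(auth_source, safe="")
    return (f"mongodb://{user_enc}:{password_enc}@{host}:{port}"
            f"/?authSource={auth_enc}")


if __name__ == "__main__":
    # Hypothetical credentials, chosen to contain delimiter characters.
    print(build_mongo_uri("app", "p@ss:w/rd", "localhost", 27017, "admin"))
    # prints mongodb://app:p%40ss%3Aw%2Frd@localhost:27017/?authSource=admin
```

Without the encoding, the `@` inside the password would be read as the end of the credentials and the connection string would be parsed incorrectly.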
In test.py, the ‘ctypes’ module does the work: I load ‘libupdate.so’, declare the argument and return types of its functions in Python, and then call them. It’s easy.