Interfaces of the plugins for OpenData import

This article describes the basic interfaces of plugins that import open data.


Job is the sequence of actions for loading an array of open data over the HTTP protocol, including processing the data and loading it into the DB of a Geo2Tag service.

Import is the transfer of data from an external DB into the DB of a Geo2Tag service.

Job status is a set of parameters describing the current state of a job: time spent, import source, completion flag and other import options.

REST interfaces

  • /<instance_prefix>/plugin/<plugin_name>/service/<serviceName>/job GET – list of import jobs and their statuses.
  • /<instance_prefix>/plugin/<plugin_name>/service/<serviceName>/job POST – add an import job; parameters:
    • channelName – the name of the channel into which the data will be imported;
    • openDataUrl – link for downloading the data set.
  • /<instance_prefix>/plugin/<plugin_name>/service/<serviceName>/job/<job_id> GET – the status of the job.
  • /<instance_prefix>/plugin/<plugin_name>/service/<serviceName>/job/<job_id> DELETE – stop the job.
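
As an illustration, the endpoints above could be addressed from Python 3 roughly as follows. This is only a sketch: the helper names, the host, the prefix, plugin and service names are hypothetical, and sending the parameters as a JSON body is an assumption suggested by the argument parser shown later in this article, which decodes the raw request data with loads.

```python
import json
from urllib.request import Request


def job_url(base_url, instance_prefix, plugin_name, service_name, job_id=None):
    """Build the URL of the job list, or of one job when job_id is given."""
    url = '{0}/{1}/plugin/{2}/service/{3}/job'.format(
        base_url.rstrip('/'), instance_prefix, plugin_name, service_name)
    if job_id is not None:
        url += '/' + job_id
    return url


def add_job_request(url, channel_name, open_data_url):
    """Prepare (but do not send) a POST request that adds an import job."""
    body = json.dumps({'channelName': channel_name,
                       'openDataUrl': open_data_url}).encode('utf-8')
    return Request(url, data=body, method='POST',
                   headers={'Content-Type': 'application/json'})


def stop_job_request(url):
    """Prepare (but do not send) a DELETE request that stops a job."""
    return Request(url, method='DELETE')


# Hypothetical host, prefix, plugin and service names.
jobs = job_url('http://localhost', 'instance', 'my_plugin', 'testservice')
request = add_job_request(jobs, 'my_channel', 'http://example.com/data.json')
# urllib.request.urlopen(request) would send it to a running Geo2Tag instance.
```

The requests are only constructed here, so the snippet can be tried without a server.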

Software interfaces



  • @should_be_extended_in_descendents – the method or property should be extended or overridden in the derived class
  • @abstract_method – an abstract method that must be defined in the derived class

Job (abstract class)

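import random
import string
from datetime import datetime


class Job():

    def __init__(
            self,
            backgroundFunction,
            channelName,
            openDataUrl,
            importDataDict,
            serviceName):
        self.thread = None
        self._id = self.generateId()
        self.startTime = datetime.now()  # assumed: current time
        self.done = False
        self.timeElapsed = None
        self.backgroundFunction = backgroundFunction
        self.channelName = channelName
        self.openDataUrl = openDataUrl
        self.importDataDict = importDataDict
        self.serviceName = serviceName

    def internalStart(self):
        # To be defined in the derived class.
        raise NotImplementedError

    def internalStop(self):
        # To be defined in the derived class.
        raise NotImplementedError

    def start(self):
        self.startTime = datetime.now()
        self.internalStart()  # assumed: start() delegates to the hook

    def stop(self):
        self.internalStop()  # assumed: stop() delegates to the hook
        self.done = True
        self.timeElapsed = datetime.now() - self.startTime

    def getTimeStatistics(self):
        # Running job: time since start; stopped job: total duration.
        if self.timeElapsed is None:
            return datetime.now() - self.startTime
        return self.timeElapsed

    def describe(self):
        return {
            '_id': self._id,
            'time': str(self.getTimeStatistics()),
            'done': self.done,
            'channelName': self.channelName,
            'openDataUrl': self.openDataUrl,
            'serviceName': self.serviceName}

    @classmethod
    def generateId(cls):
        # Random 12-character identifier made of letters and digits.
        return ''.join(
            random.choice(
                string.ascii_uppercase +
                string.ascii_lowercase +
                string.digits) for x in range(12))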
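
The thread attribute and the internalStart/internalStop hooks suggest that a derived class runs the import in the background. A minimal sketch under that assumption follows. JobBase is a reduced stand-in for Job so that the snippet is self-contained; ThreadJob, fake_import and the argument order passed to the background function are hypothetical.

```python
import threading


class JobBase(object):
    """Reduced stand-in for Job: only what this sketch needs."""

    def __init__(self, backgroundFunction, channelName, openDataUrl,
                 importDataDict, serviceName):
        self.thread = None
        self.done = False
        self.backgroundFunction = backgroundFunction
        self.channelName = channelName
        self.openDataUrl = openDataUrl
        self.importDataDict = importDataDict
        self.serviceName = serviceName

    def internalStart(self):
        raise NotImplementedError

    def start(self):
        self.internalStart()


class ThreadJob(JobBase):
    """Hypothetical derived class: runs the import function in a thread."""

    def internalStart(self):
        self.thread = threading.Thread(target=self._run)
        self.thread.start()

    def _run(self):
        # Assumed argument order; it has to match the import function.
        self.backgroundFunction(self.serviceName, self.channelName,
                                self.openDataUrl, self.importDataDict)
        self.done = True


imported = []


def fake_import(serviceName, channelName, openDataUrl, importDataDict):
    # Placeholder for a real import function.
    imported.append((serviceName, channelName, openDataUrl))


job = ThreadJob(fake_import, 'my_channel', 'http://example.com/data.json',
                {}, 'testservice')
job.start()
job.thread.join()  # wait for the background import to finish
```

A real plugin would also implement internalStop, for example by setting a flag that the import loop checks between objects.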

OpenDataObjectsLoader (abstract class)

class OpenDataObjectsLoader:
    def __init__(self, loadUrl):
        self.loadUrl = loadUrl

    def load(self):
        # Abstract: download the open data from self.loadUrl and return it.
        raise NotImplementedError
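
A derived loader only has to implement load. The sketch below, written for Python 3, assumes the data set can be fetched with a plain request and is UTF-8 text. UrlTextLoader is a hypothetical name, and the base class is repeated in reduced form to keep the snippet self-contained.

```python
from urllib.request import urlopen


class OpenDataObjectsLoader(object):
    def __init__(self, loadUrl):
        self.loadUrl = loadUrl

    def load(self):
        raise NotImplementedError


class UrlTextLoader(OpenDataObjectsLoader):
    """Hypothetical loader: downloads the whole data set as one string."""

    def load(self):
        response = urlopen(self.loadUrl)
        try:
            return response.read().decode('utf-8')
        finally:
            response.close()
```

Sources that need authorization or a protocol other than HTTP(S) would replace the body of load while keeping the same interface.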


OpenDataToPointTranslator (abstract class)

class OpenDataToPointTranslator:
    def __init__(
            self, importDataDict, objectRepresentation,
            version, importSource, channelId):
        self.objectRepresentation = objectRepresentation
        self.version = version
        self.importSource = importSource
        self.channelId = channelId
        self.importDataDict = importDataDict

    def getPointJson(self):
        # Fields common to all imported points; extend in the derived class.
        obj = {}
        obj['version'] = self.version
        obj['import_source'] = self.importSource
        return obj

    def getPoint(self):
        point = {'json': self.getPointJson()}
        point['channel_id'] = self.channelId
        return point
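
A derived translator typically extends getPointJson and getPoint with source-specific fields. The following is a sketch only: the source object layout (name, lat, lon), the class name CityToPointTranslator and the location key are assumptions made for illustration, not the Geo2Tag point format. The base class is repeated so that the snippet runs on its own.

```python
class OpenDataToPointTranslator(object):
    def __init__(self, importDataDict, objectRepresentation, version,
                 importSource, channelId):
        self.objectRepresentation = objectRepresentation
        self.version = version
        self.importSource = importSource
        self.channelId = channelId
        self.importDataDict = importDataDict

    def getPointJson(self):
        return {'version': self.version, 'import_source': self.importSource}

    def getPoint(self):
        return {'json': self.getPointJson(), 'channel_id': self.channelId}


class CityToPointTranslator(OpenDataToPointTranslator):
    """Hypothetical translator for objects like {'name', 'lat', 'lon'}."""

    def getPointJson(self):
        obj = OpenDataToPointTranslator.getPointJson(self)
        obj['name'] = self.objectRepresentation['name']
        return obj

    def getPoint(self):
        point = OpenDataToPointTranslator.getPoint(self)
        # 'location' is an assumed key; use what the points collection expects.
        point['location'] = {
            'type': 'Point',
            'coordinates': [self.objectRepresentation['lon'],
                            self.objectRepresentation['lat']]}
        return point


translator = CityToPointTranslator(
    {}, {'name': 'Petrozavodsk', 'lat': 61.79, 'lon': 34.35},
    1, 'http://example.com/data.json', 'channel-1')
point = translator.getPoint()
```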

OpenDataObjectsParser (abstract class)

class OpenDataObjectsParser:
    def __init__(self, data):
        self.data = data
    def parse(self):
        # Abstract: split self.data into a list of single objects.
        raise NotImplementedError
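
For a source that delivers the whole data set as one JSON array, a derived parser could look like the sketch below. JsonArrayParser is a hypothetical name and the JSON-array layout is an assumption; the base class is repeated in reduced form.

```python
import json


class OpenDataObjectsParser(object):
    def __init__(self, data):
        self.data = data

    def parse(self):
        raise NotImplementedError


class JsonArrayParser(OpenDataObjectsParser):
    """Hypothetical parser: the data set is a JSON array of objects."""

    def parse(self):
        objects = json.loads(self.data)
        if not isinstance(objects, list):
            raise ValueError('expected a JSON array of objects')
        return objects
```

XML or CSV sources would follow the same pattern with a different body in parse.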


class OpenKareliaDataToPointsLoader:
    pointsArray = []

    def __init__(self, serviceName, points):
        self.pointsArray = points
        self.serviceName = serviceName

    def loadPoints(self):
        collection = getDbObject(self.serviceName)[POINTS]
        for point in self.pointsArray:
            # Each point is written to the collection here;
            # the write call itself is not shown.
            pass

JobManager – class for managing jobs: it starts and stops jobs and reports their status.

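class JobManager:
    jobs = {}  # jobId -> job, shared by all resources
    @classmethod
    def startJob(cls, job):
        jobId = job.describe().get('_id', '')
        job.start()
        cls.jobs[jobId] = job
        return jobId
    @classmethod
    def getJob(cls, jobId):
        # Assumed body: return the status of a single job.
        return cls.jobs[jobId].describe()
    @classmethod
    def stopJob(cls, jobId):
        # Assumed body: stop the job; it stays in the registry.
        cls.jobs[jobId].stop()
    @classmethod
    def getJobs(cls):
        result = []
        for job in cls.jobs.values():
            result.append(job.describe())
        return result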


class JobResource(Resource):
    def get(self, serviceName, jobId):
        return JobManager.getJob(jobId)
    def delete(self, serviceName, jobId):
        return JobManager.stopJob(jobId)


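class ODImportParser():
    # Class methods, because the resource factory calls
    # parserClass.parsePostParameters() on the class itself.
    @classmethod
    def parse(cls):
        args = loads(request.get_data())
        return args

    @classmethod
    def validate(cls, args):
        for key in MANDATORY_FIELDS:
            if key not in args:
                raise BadRequest('{0} parameter is missing'.format(key))
            elif not isinstance(args[key], unicode):
                raise BadRequest('{0} value is not unicode'.format(key))

    @classmethod
    def parsePostParameters(cls):
        args = cls.parse()
        cls.validate(args)  # assumed: validation runs before the job is created
        return args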
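
The validation logic can be tried outside of a web request with a standalone sketch such as the one below. It is written for Python 3, so str replaces unicode, and ValueError stands in for BadRequest. The value of MANDATORY_FIELDS is an assumption based on the POST parameters listed in the REST interfaces section, and parse_post_parameters is a hypothetical helper name.

```python
import json

# Assumed value: the two POST parameters listed in the REST interfaces.
MANDATORY_FIELDS = ['channelName', 'openDataUrl']


def parse_post_parameters(raw_body):
    """Decode a JSON request body and check the mandatory fields."""
    args = json.loads(raw_body)
    for key in MANDATORY_FIELDS:
        if key not in args:
            raise ValueError('{0} parameter is missing'.format(key))
        if not isinstance(args[key], str):
            raise ValueError('{0} value is not a string'.format(key))
    return args


args = parse_post_parameters(
    '{"channelName": "my_channel", "openDataUrl": "http://example.com/d.json"}')
```

A plugin with additional mandatory parameters would extend the list of fields in its own parser class.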

JobListResourceFactory – factory method that creates the JobListResource class for a given parser class, job class and import function.

def JobListResourceFactory(parserClass, jobClass, importFunction):
    class JobListResource(Resource):
        def get(self, serviceName):
            return JobManager.getJobs()

        def post(self, serviceName):
            importDataDict = parserClass.parsePostParameters()
            channelName = importDataDict.get('channelName')
            # The result is not used; the call presumably fails
            # if the channel does not exist.
            getChannelByName(serviceName, channelName)
            job = jobClass(importFunction, importDataDict.get('channelName'),
                           importDataDict.get('openDataUrl'), importDataDict,
                           serviceName)
            return JobManager.startJob(job)
    return JobListResource

Method performImportActions

performImportActions ( odLoaderClass, odParserClass, odToPointTranslatorClass, odToPointsLoaderClass, serviceName, channelName, openDataUrl, showObjectUrl, showImageUrl)

A function that contains the logic of importing from an open data source.

def performImportActions(
        odLoaderClass, odParserClass,
        odToPointTranslatorClass, odToPointsLoaderClass,
        serviceName, channelName, openDataUrl,
        # showObjectUrl, showImageUrl
        importDataDict):  # assumed parameter: used below but absent above
    channelId = getChannelIdByName(channelName)
    version = datetime.now()  # assumed: the import time serves as version
    loader = odLoaderClass(openDataUrl)
    openData = loader.load()
    parser = odParserClass(openData)
    objects = parser.parse()
    points = []
    for object in objects:
        translator = odToPointTranslatorClass(
            importDataDict, object, version, openDataUrl, channelId)
        points.append(translator.getPoint())
    pointsLoader = odToPointsLoaderClass(serviceName, points)
    pointsLoader.loadPoints()



How to create your own OpenData import plugin


This article is a manual on creating your own OpenData import plugins for the Geo2Tag platform. Detailed information about the basic classes and common interfaces of these plugins can be found in the article Interfaces of the plugins for OpenData import.

Preliminary preparation

Before you create your OpenData import plugin, you should have answers to the following questions:

  1. What kind of access to the open data do you have:
    • access protocol (HTTP(S), FTP, WebDAV, ...)
    • whether encryption is used
    • whether authorization is required
  2. How does the user receive the open data: as an array or element by element?
  3. In which format does the user receive the data (XML, JSON, CSV, other)?
  4. Is it possible to access a data element through an external link?
  5. Do the data elements contain images and, if so, how can you get an external link to them?
  6. What kind of location information do the data elements contain:
    • latitude, longitude and altitude
    • postal address
    • zip code

Creating the basic classes

A minimal implementation of an OpenData import plugin inherits from or extends the following classes:

  1. Job – the abstract class for an import job.
  2. OpenDataObjectsLoader – the basic class for downloading open data elements from external sources.
  3. OpenDataToPointTranslator – the abstract class for converting open data elements to the Geo2Tag point format.
  4. OpenDataObjectsParser – the abstract class for splitting the array of open data into single elements.
  5. OpenDataToPointsLoader – the abstract class for writing points to the Geo2Tag DB.
  6. ODImportParser – parser of arguments for REST queries to the plugin.

To create the standard REST queries you have to use:

  1. The resource class JobResource, which manages a single import job.
  2. The factory method JobListResourceFactory, which creates the resource class for adding and viewing all import jobs.

For the overall sequence of import actions you can use the existing function performImportActions; the parameters passed to it are classes derived from the classes listed above. It is also possible to create your own implementation of the import function. In that case the main requirement is that the implementation of your class derived from Job matches the prototype of the import function.

Adding a plugin

The created set of source files must be packaged and connected as described in the article Format and connection of plugins.
