CLI walkthrough
1. Getting started with the CLI
Create a new folder for the repository you'll be working on by running the following command:
mkdir devchallenge && cd devchallenge
While you can keep your job scripts anywhere, it's good practice to store state.json and output.json in a tmp folder. To do this, create a new directory called tmp within your devchallenge folder:
mkdir tmp
Since state.json and output.json may contain sensitive configuration information and project data, it's important to never upload them to GitHub. To ensure that Git ignores these files, add the tmp directory to your .gitignore file:
echo "tmp" >> .gitignore
(Optional) Use the tree command to check that your directory structure looks correct. Running tree -a in your devchallenge folder should display a structure like this:
devchallenge
├── .gitignore
└── tmp
    ├── state.json
    └── output.json
Create a job file called hello.js and write the following code:
console.log('Hello World!');
What is a job?
An OpenFn job is JavaScript code which follows a particular set of conventions. Typically a job has one or more operations which perform a particular task (like pulling information from a database, creating a record, etc.) and return state for the next operation to use.
What is console.log?
console.log is a core JavaScript language function which lets us output messages to the terminal window.
Run the job using the CLI:
openfn hello.js -o tmp/output.json
Expected output:
[CLI] ⚠ WARNING: No adaptor provided!
[CLI] ⚠ This job will probably fail. Pass an adaptor with the -a flag, eg:
openfn job.js -a common
[CLI] ✔ Compiled from hello.js
[R/T] ♦ Starting job job-1
[JOB] ℹ Hello World!
[R/T] ✔ Completed job job-1 in 1ms
[CLI] ✔ State written to tmp/output.json
[CLI] ✔ Finished in 17ms ✨
Note that the message from our console.log statement was printed with a [JOB] prefix: [JOB] ℹ Hello World! Using the console like this is helpful for debugging and for understanding what's happening inside our steps.
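The idea that each operation returns state for the next one to use can be sketched in plain JavaScript. The snippet below is only an illustration and not the OpenFn runtime: fn and runOperations are simplified, synchronous stand-ins written for this example.

```javascript
// Simplified stand-in for an operation factory: wraps a callback so it
// can later be called with state. (Hypothetical; not the real adaptor code.)
const fn = callback => state => callback(state);

// Run operations in order, feeding each one the state returned by the last.
const runOperations = (operations, initialState) =>
  operations.reduce((state, operation) => operation(state), initialState);

const operations = [
  // First operation: put something on state.data.
  fn(state => ({ ...state, data: { greeting: 'Hello World!' } })),
  // Second operation: read what the first one returned.
  fn(state => {
    console.log(state.data.greeting);
    return state;
  }),
];

const finalState = runOperations(operations, { configuration: {}, data: {} });
```

Because every operation returns state, the second operation can read the greeting that the first one stored.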
2. Using adaptor helper functions
Adaptors are JavaScript or TypeScript (a strongly typed superset of JavaScript) modules that provide OpenFn users with a set of helper functions for simplifying communication with a specific external system. Learn more about adaptors here: docs.openfn.org/adaptors
Basic usage:
Let's use the @openfn/language-http adaptor to fetch a list of posts from https://jsonplaceholder.typicode.com/
Tasks:
Create a file called getPosts.js and write the following code:
getPosts.js
get('https://jsonplaceholder.typicode.com/posts');
fn(state => {
  console.log(state.data[0]);
  return state;
});
Run the job by running:
openfn getPosts.js -i -a http -o tmp/output.json
Use -a to specify the adaptor; use -i to auto-install the necessary adaptor. Run openfn help to see the full list of CLI arguments.
Since it is our first time using the http adaptor, we install it using the -i argument.
Expected CLI logs:
[CLI] ✔ Installing packages...
[CLI] ✔ Installed @openfn/language-http@4.2.8
[CLI] ✔ Installation complete in 14.555s
[CLI] ✔ Compiled from getPosts.js
[R/T] ♦ Starting job job-1
GET request succeeded with 200 ✓
[JOB] ℹ {
userId: 1,
id: 1,
title: 'sunt aut facere repellat provident occaecati excepturi optio reprehenderit',
body: 'quia et suscipit\n' +
'suscipit recusandae consequuntur expedita et cum\n' +
'reprehenderit molestiae ut ut quas totam\n' +
'nostrum rerum est autem sunt rem eveniet architecto'
}
[R/T] ✔ Completed job job-1 in 872ms
[CLI] ✔ State written to tmp/output.json
[CLI] ✔ Finished in 15.518s ✨
The data displayed in these CLI logs is generated by the JSONPlaceholder API and does not represent real-world information. It is intended for testing and development purposes only.
For accurate testing, consider using real data from your API or service.
3. Understanding state
If a job expression is a set of instructions for a chef (a recipe?) then the initial state is all of the ingredients they need tied up in a perfect little bundle. See "It all starts with state" in the knowledge base for extra context.
It usually looks something like this
{
"configuration": {
"hostUrl": "https://moh.kenya.gov.ke/dhis2",
"username": "someone",
"password": "something-secret"
},
"data": {
"type": "registration",
"patient": {
"age": 24,
"gender": "M",
"nationalId": "321cs7"
}
}
}
state.configuration
This key is where we put credentials which are used to authorize connections to any authenticated system that the job will interact with. (Note that this part of state is usually overwritten at runtime with a real "credential" when using the OpenFn platform, rather than the CLI.)
Note that console.log(state) will display the whole state, including state.configuration elements such as username and password. Remove this log whenever you're done debugging to avoid accidentally exposing sensitive information when the job is deployed to production.
The OpenFn platform has built-in protections to "scrub" state from the logs, but when you're using the CLI directly you're on your own!
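One way to keep credentials out of the terminal while still inspecting state is to log a copy with the configuration key removed. This is a minimal sketch: withoutConfiguration is a hypothetical helper written for this example, not part of any adaptor, and exampleState reuses the made-up values from the state shown earlier.

```javascript
// Hypothetical helper: return a copy of state without the configuration key,
// so credentials are not printed when inspecting state from the CLI.
const withoutConfiguration = state => {
  const { configuration, ...rest } = state;
  return rest;
};

const exampleState = {
  configuration: { username: 'someone', password: 'something-secret' },
  data: { type: 'registration' },
};

// Prints only the data key; the original object is left untouched.
console.log(withoutConfiguration(exampleState));
```

Inside a job, the same helper could be called from an fn() block in place of console.log(state), as long as the block still returns the full state.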
state.data
This key is where we put data related to a specific job run. On the platform, it's the work-order-specific data from a triggering HTTP request or some bit of information that's passed from one job to another.
When using the CLI, state.json will be loaded automatically from the current directory. Alternatively, you can specify the path to the state file by passing the -s (--state-path) option.
Specify a path to your state.json file with this command:
openfn hello.js -a http -s tmp/state.json -o tmp/output.json
Expected CLI logs:
[CLI] ✔ Compiled job from hello.js
GET request succeeded with 200 ✓
[R/T] ✔ Operation 1 complete in 876ms
[R/T] ✔ Operation 2 complete in 0ms
[CLI] ✔ Writing output to tmp/output.json
[CLI] ✔ Done in 1.222s! ✨
How can we use state?
Each adaptor has a configuration schema that's recommended for use in your state.json. Here is an example of how to set up state.configuration for language-http.
{
"username": "name@email",
"password": "supersecret",
"baseUrl": "https://jsonplaceholder.typicode.com"
}
Tasks:
Update your state.json to look like this:
state.json
{
  "configuration": {
    "baseUrl": "https://jsonplaceholder.typicode.com"
  }
}
Since we have updated the configuration in our state.json, we can now use the get() helper function without specifying the baseUrl, i.e. get('posts').
Update your getPosts.js job to look like this:
getPosts.js
// Get all posts
get('posts');
fn(state => {
  const posts = state.data;
  console.log(posts[0]);
  return state;
});
Now run the job using the following command:
openfn getPosts.js -a http -s tmp/state.json -o tmp/output.json
And validate that you see the expected CLI logs:
[CLI] ✔ Compiled job from getPosts.js
GET request succeeded with 200 ✓
[R/T] ✔ Operation 1 complete in 120ms
[JOB] ℹ {
userId: 1,
id: 1,
title: 'sunt aut facere repellat provident occaecati excepturi optio reprehenderit',
body: 'quia et suscipit\n' +
'suscipit recusandae consequuntur expedita et cum\n' +
'reprehenderit molestiae ut ut quas totam\n' +
'nostrum rerum est autem sunt rem eveniet architecto'
}
[R/T] ✔ Operation 2 complete in 0ms
[CLI] ✔ Writing output to tmp/output.json
[CLI] ✔ Done in 470ms! ✨
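The effect of setting baseUrl in state.configuration can be pictured as joining the base URL with the relative path passed to get(). The sketch below is an assumption about the general idea only: resolveUrl is a hypothetical helper written for this example, and the adaptor's real logic is not shown in this walkthrough.

```javascript
// Hypothetical helper showing one way a relative path can be combined
// with a base URL. Not taken from the adaptor's source code.
const resolveUrl = (baseUrl, path) => {
  // Ensure exactly one slash between the base and the path.
  const base = baseUrl.replace(/\/+$/, '');
  const suffix = path.replace(/^\/+/, '');
  return `${base}/${suffix}`;
};

const url = resolveUrl('https://jsonplaceholder.typicode.com', 'posts');
console.log(url); // https://jsonplaceholder.typicode.com/posts
```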
4. Clean & Transform Data
In most cases you need to manipulate, clean, or transform data at some step in your workflow. For example, after we get data from the https://jsonplaceholder.typicode.com registry we might need to group the posts by user id. The example below shows how we can:
- get all posts and return them in state.data
- group returned posts by userId
- log posts with userId 1
Example:
// Get all posts
get('posts');
// Group posts by user id
fn(state => {
const posts = state.data;
// Group posts by userId
const groupPostsByUserId = posts.reduce((acc, post) => {
const existingValue = acc[post.userId] || [];
return {
...acc,
[post.userId]: [...existingValue, post],
};
}, {});
console.log(groupPostsByUserId);
return { ...state, groupPostsByUserId };
});
// Log posts where userId = 1
fn(state => {
const { groupPostsByUserId } = state;
console.log('Post with userId 1', groupPostsByUserId[1]);
return state;
});
What is array.reduce?
The reduce() method applies a function against an accumulator and each value of the array (from left to right) to reduce it to a single value.
Perhaps the easiest-to-understand case for reduce() is to return the sum of all the elements in an array:
// 0 + 1 + 2 + 3 + 4
const array1 = [1, 2, 3, 4];
const initialValue = 0;
const sumWithInitial = array1.reduce(
(accumulator, currentValue) => accumulator + currentValue,
initialValue
);
console.log(sumWithInitial); // Expected output: 10
You can learn more about array.reduce from this article.
Expected CLI logs:
[CLI] ✔ Compiled job from getPosts.js
GET request succeeded with 200 ✓
[R/T] ✔ Operation 1 complete in 825ms
[R/T] ✔ Operation 2 complete in 0ms
[JOB] ℹ Post with userId 1 [ //All of posts for userId 1 ]
[R/T] ✔ Operation 3 complete in 12ms
[CLI] ✔ Writing output to tmp/output.json
[CLI] ✔ Done in 1.239s! ✨
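To try the grouping logic on its own, outside of a job and without a network call, the same reduce can be run against a small array. The samplePosts values below are made up for this example and only mimic the shape of the JSONPlaceholder posts.

```javascript
// Made-up sample data in the same shape as the JSONPlaceholder posts.
const samplePosts = [
  { userId: 1, id: 1, title: 'first' },
  { userId: 2, id: 11, title: 'second' },
  { userId: 1, id: 2, title: 'third' },
];

// Same reduce-based grouping as in the job above.
const groupByUserId = posts =>
  posts.reduce((acc, post) => {
    const existingValue = acc[post.userId] || [];
    return { ...acc, [post.userId]: [...existingValue, post] };
  }, {});

const grouped = groupByUserId(samplePosts);
console.log(grouped[1].map(post => post.id)); // [ 1, 2 ]
```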
5. Debugging errors
When debugging, it's helpful to use console.log to inspect the content of the objects you are manipulating (such as state).
When you want to inspect the content of state in between operations, add an fn() block with a console.log:
// firstOperation(...);
fn(state => {
console.log(state);
return state;
});
// secondOperation(...);
Create debug.js and paste the code below:
debug.js
// Get all posts
get('posts');
// Get post by index helper function
fn(state => {
// const getPostbyIndex = (index) => dataValue(index)(state);
console.log(dataValue(1));
return { ...state };
});
Run openfn debug.js -a http -s tmp/state.json
Expected CLI logs
[CLI] ✘ TypeError: path.match is not a function
at dataPath (/tmp/openfn/repo/node_modules/@openfn/language-common/dist/index.cjs:258:26)
at dataValue (/tmp/openfn/repo/node_modules/@openfn/language-common/dist/index.cjs:262:22)
at getPostbyIndex (vm:module(0):5:37)
at vm:module(0):18:36
at /tmp/openfn/repo/node_modules/@openfn/language-common/dist/index.cjs:241:12
at file:///home/openfn/.asdf/installs/nodejs/18.12.0/lib/node_modules/@openfn/cli/node_modules/@openfn/runtime/dist/index.js:288:26
at process.processTicksAndRejections (node:internal/process/task_queues:95:5)
at async run (file:///home/openfn/.asdf/installs/nodejs/18.12.0/lib/node_modules/@openfn/cli/node_modules/@openfn/runtime/dist/index.js:269:18)
at async executeHandler (file:///home/openfn/.asdf/installs/nodejs/18.12.0/lib/node_modules/@openfn/cli/dist/process/runner.js:388:20)
As you can see from the logs, the helper function dataValue threw a TypeError. To troubleshoot this, go to the documentation for dataValue:
docs.openfn.org/adaptors/packages/common-docs/#datavaluepath--operation
According to the docs, dataValue takes a path as input, which is of the string type. In our operation we were passing an integer, which is why we got a TypeError. You can fix the error by passing a string to dataValue, i.e. console.log(dataValue('1'))
Expected CLI logs
[CLI] ✔ Compiled job from debug.js
GET request succeeded with 200 ✓
[R/T] ✔ Operation 1 complete in 722ms
[JOB] ℹ [Function (anonymous)]
[R/T] ✔ Operation 2 complete in 1ms
[CLI] ✔ Writing output to tmp/output.json
[CLI] ✔ Done in 1.102s ✨
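The [Function (anonymous)] line in the log above suggests that dataValue returns a function rather than a value, which matches the commented-out line in debug.js where the result is called with state. The sketch below imitates that behaviour with a simplified stand-in: dataValueSketch is hypothetical, written only for this example, and resolves plain dot-separated paths instead of whatever path syntax the real helper supports.

```javascript
// Simplified stand-in for dataValue, written only for this example.
// It mimics two behaviours seen above: the path must be a string, and
// the return value is a function that still needs to be called with state.
const dataValueSketch = path => {
  if (typeof path !== 'string') {
    throw new TypeError('path must be a string');
  }
  // Walk the dot-separated path starting from state.data.
  return state =>
    path
      .split('.')
      .reduce((value, key) => (value == null ? undefined : value[key]), state.data);
};

const state = { data: [{ id: 1 }, { id: 2 }] };

console.log(typeof dataValueSketch('1')); // function
console.log(dataValueSketch('1')(state)); // { id: 2 }
```

Calling the returned function with state is what produces the actual value, so logging dataValue('1')(state) inside the fn() block would show the post instead of a function.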
If you need more information for debugging you can pass -l debug. This sets the log level to debug, which logs all information about the run, i.e. openfn debug.js -a http -l debug
6. Each and array iteration
We often have to perform the same operation multiple times for items in an array. Most of the helper functions for data manipulation are inherited from @openfn/language-common and are available in most of the adaptors.
Modify getPosts.js to group posts by user ID:
getPosts.js
// Get all posts
get('posts');
// Group posts by user
fn(state => {
const posts = state.data;
// Group posts by userId
const groupPostsByUserId = posts.reduce((acc, post) => {
const existingValue = acc[post.userId] || [];
return { ...acc, [post.userId]: [...existingValue, post] };
}, {});
// console.log(groupPostsByUserId);
return { ...state, groupPostsByUserId };
});
// Log posts where userId = 1
fn(state => {
const { groupPostsByUserId } = state;
const posts = groupPostsByUserId[1];
// console.log("Post with userId 1", groupPostsByUserId[1]);
return { ...state, posts };
});
each('posts[*]', state => {
console.log('Post', JSON.stringify(state.data, null, 2));
return state;
});
Notice how this code uses the each function, a helper function defined in language-common but accessed from this job that is using language-http. Most adaptors import many functions from language-common.
Run openfn getPosts.js -a http -s tmp/state.json -o tmp/output.json
Expected CLI logs:
[CLI] ✔ Compiled job from getPosts.js
GET request succeeded with 200 ✓
[R/T] ✔ Operation 1 complete in 730ms
[R/T] ✔ Operation 2 complete in 0ms
[R/T] ✔ Operation 3 complete in 0ms
[JOB] ℹ Posts [
// Posts
]
[R/T] ✔ Operation 4 complete in 10ms
[CLI] ✔ Writing output to tmp/output.json
[CLI] ✔ Done in 1.091s! ✨
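The iteration performed by each can be pictured with a small stand-in. This is a rough sketch under stated assumptions, not the language-common implementation: eachSketch is hypothetical, takes a plain key on state instead of a path such as 'posts[*]', and makes no claim about what the real helper leaves in state.data once the loop has finished.

```javascript
// Simplified stand-in for each(), written only for this example. It takes
// a key on state instead of a path like 'posts[*]', and runs the operation
// once per item with that item placed in state.data.
const eachSketch = (key, operation) => state =>
  state[key].reduce(
    (current, item) => operation({ ...current, data: item }),
    state
  );

// Record the ids that the operation sees, in order.
const seen = [];
const logPost = state => {
  seen.push(state.data.id);
  return state;
};

const result = eachSketch('posts', logPost)({
  posts: [{ id: 1 }, { id: 2 }],
  data: {},
});

console.log(seen); // [ 1, 2 ]
```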
7. Running Workflows
Running a workflow allows you to define a list of steps and rules for executing them. You can use a workflow to orchestrate the flow of data between systems in a structured and automated way.
For example, if you have two steps in your workflow (GET users from system A & POST users to system B), you can set up your workflow to run all steps in sequence from start to finish. This imitates the flow trigger patterns on the OpenFn platform where a second job should run after the first one succeeds, using the data returned from the first job.
You won't have to assemble the initial state of the next job: the final state of the upstream job will automatically be passed down to the downstream job as its initial state.
Workflow
A workflow is the execution plan for running several steps in a sequence. It is defined as a JSON object that consists of the following properties:
{
  "options": {
    "start": "a" // optionally specify the start node (defaults to steps[0])
  },
  "workflow": {
    "steps": [
      {
        "id": "a",
        "expression": "fn((state) => state)", // code or a path
        "adaptor": "@openfn/language-common@1.75", // specify the adaptor to use (version optional)
        "state": {
          "data": {} // optionally pre-populate the data object (this will be overridden by keys in previous state)
        },
        "configuration": {}, // Use this to pass credentials
        "next": {
          // This object defines which steps to call next
          // All edges returning true will run
          // If there are no next edges, the workflow will end
          "b": true,
          "c": {
            "condition": "!state.error" // Note that this is an expression, not a function
          }
        }
      }
    ]
  }
}
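The rule that "all edges returning true will run" can be illustrated with a short sketch. It assumes that a condition is a JavaScript expression string evaluated with state in scope, as the comment in the definition above suggests. evaluateEdge and nextStepIds are hypothetical names written for this example and are not the runtime's actual code.

```javascript
// Hypothetical sketch of choosing which steps run next. An edge is either
// `true` or an object whose `condition` is a JavaScript expression string
// evaluated with `state` in scope.
const evaluateEdge = (edge, state) => {
  if (edge === true) return true;
  if (edge && typeof edge.condition === 'string') {
    // new Function keeps the example short; only use it with trusted input.
    return Boolean(new Function('state', `return (${edge.condition});`)(state));
  }
  return false;
};

// Collect the ids of every edge that evaluates to true.
const nextStepIds = (step, state) =>
  Object.entries(step.next || {})
    .filter(([, edge]) => evaluateEdge(edge, state))
    .map(([id]) => id);

const stepA = {
  id: 'a',
  next: { b: true, c: { condition: '!state.error' } },
};

console.log(nextStepIds(stepA, { data: {} })); // [ 'b', 'c' ]
console.log(nextStepIds(stepA, { error: 'boom' })); // [ 'b' ]
```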
Example of a workflow
Here's an example of a simple workflow that consists of three steps:
{
"options": {
"start": "getPatients"
},
"workflow": {
"steps": [
{
"id": "getPatients",
"adaptor": "http",
"expression": "getPatients.js",
"configuration": "tmp/http-creds.json",
"next": {
"getGlobalOrgUnits": true
}
},
{
"id": "getGlobalOrgUnits",
"adaptor": "common",
"expression": "getGlobalOrgUnits.js",
"next": {
"createTEIs": true
}
},
{
"id": "createTEIs",
"adaptor": "dhis2",
"expression": "createTEIs.js",
"configuration": "tmp/dhis2-creds.json"
}
]
}
}
tmp/http-creds.json
{
"baseUrl": "https://jsonplaceholder.typicode.com"
}
tmp/dhis2-creds.json
{
"hostUrl": "https://play.im.dhis2.org/dev",
"password": "district",
"username": "admin"
}
getPatients.js
// Get users from jsonplaceholder
get('users');
// Prepare new users as new patients
fn(state => {
const newPatients = state.data;
return { ...state, newPatients };
});
getGlobalOrgUnits.js
// Globals: orgUnits
fn(state => {
const globalOrgUnits = [
{
label: 'Njandama MCHP',
id: 'g8upMTyEZGZ',
source: 'Gwenborough',
},
{
label: 'Njandama MCHP',
id: 'g8upMTyEZGZ',
source: 'Wisokyburgh',
},
{
label: 'Njandama MCHP',
id: 'g8upMTyEZGZ',
source: 'McKenziehaven',
},
{
label: 'Njandama MCHP',
id: 'g8upMTyEZGZ',
source: 'South Elvis',
},
{
label: 'Ngelehun CHC',
id: 'IpHINAT79UW',
source: 'Roscoeview',
},
{
label: 'Ngelehun CHC',
id: 'IpHINAT79UW',
source: 'South Christy',
},
{
label: 'Ngelehun CHC',
id: 'IpHINAT79UW',
source: 'Howemouth',
},
{
label: 'Ngelehun CHC',
id: 'IpHINAT79UW',
source: 'Aliyaview',
},
{
label: 'Baoma Station CHP',
id: 'jNb63DIHuwU',
source: 'Bartholomebury',
},
{
label: 'Baoma Station CHP',
id: 'jNb63DIHuwU',
source: 'Lebsackbury',
},
];
return { ...state, globalOrgUnits };
});
createTEIs.js
fn(state => {
const { newPatients, globalOrgUnits } = state;
const getOrgUnit = city =>
globalOrgUnits.find(orgUnit => orgUnit.source === city).id;
const mappedEntities = newPatients.map(patient => {
const [firstName = 'Patient', lastName = 'Test'] = (
patient.name || ''
).split(' ');
const orgUnit = getOrgUnit(patient.address.city);
const attributes = [
{ attribute: 'w75KJ2mc4zz', value: firstName },
{ attribute: 'zDhUuAYrxNC', value: lastName },
{ attribute: 'cejWyOfXge6', value: 'Male' },
];
return { ...patient, attributes: attributes, orgUnit: orgUnit };
});
return { ...state, mappedEntities };
});
each(
'mappedEntities[*]',
create('trackedEntityInstances', {
orgUnit: dataValue('orgUnit'),
trackedEntityType: 'nEenWmSyUEp',
attributes: dataValue('attributes'),
})
);
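One detail in createTEIs.js is worth checking before relying on the 'Patient' default. Destructuring defaults apply only when a value is undefined, and splitting an empty string yields an array containing one empty string, so a missing name produces an empty first name instead of 'Patient'. The sketch below demonstrates this in isolation; splitAsInJob and splitWithDefaults are hypothetical helper names written for this example, and the second is just one possible adjustment.

```javascript
// Same pattern as in createTEIs.js, wrapped in a function for testing.
const splitAsInJob = name => {
  const [firstName = 'Patient', lastName = 'Test'] = (name || '').split(' ');
  return { firstName, lastName };
};

// Hypothetical alternative: drop empty parts so the defaults can apply.
const splitWithDefaults = name => {
  const [firstName = 'Patient', lastName = 'Test'] = (name || '')
    .split(' ')
    .filter(Boolean);
  return { firstName, lastName };
};

console.log(splitAsInJob(undefined)); // { firstName: '', lastName: 'Test' }
console.log(splitWithDefaults(undefined)); // { firstName: 'Patient', lastName: 'Test' }
console.log(splitWithDefaults('Leanne Graham')); // { firstName: 'Leanne', lastName: 'Graham' }
```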
Run openfn [path/to/workflow.json] to execute the workflow.
For example, if you created workflow.json in the root of your project directory, this would be the project structure:
devchallenge
├── .gitignore
├── getPatients.js
├── createTEIs.js
├── getGlobalOrgUnits.js
├── workflow.json
└── tmp
    ├── http-creds.json
    ├── dhis2-creds.json
    └── output.json
openfn workflow.json -o tmp/output.json
On execution, this workflow will first run the getPatients.js job. If it is successful, getGlobalOrgUnits.js will run using the final state of getPatients.js. If getGlobalOrgUnits.js is successful, createTEIs.js will run using the final state of getGlobalOrgUnits.js.
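The hand-off described above, where each step runs only if the previous one succeeded and starts from its final state, can be sketched as follows. The step functions are stand-ins written for this example (the real steps are the job files listed in workflow.json), runInSequence is a hypothetical name, and how the CLI itself reports or handles a failed step is not covered here.

```javascript
// Hypothetical runner: call each step with the state returned by the
// previous one, and stop at the first step that throws.
const runInSequence = (order, steps, initialState) => {
  let state = initialState;
  const completed = [];
  for (const id of order) {
    try {
      state = steps[id](state);
      completed.push(id);
    } catch (error) {
      // A failed step ends the chain, so later steps never run.
      break;
    }
  }
  return { state, completed };
};

// Order taken from the "next" edges in the example workflow.
const order = ['getPatients', 'getGlobalOrgUnits', 'createTEIs'];

// Stand-in steps for illustration only: each is a plain function of state.
const steps = {
  getPatients: state => ({ ...state, newPatients: [{ name: 'Example Patient' }] }),
  getGlobalOrgUnits: state => ({ ...state, globalOrgUnits: [] }),
  createTEIs: state => ({ ...state, created: state.newPatients.length }),
};

const success = runInSequence(order, steps, {});
console.log(success.completed); // [ 'getPatients', 'getGlobalOrgUnits', 'createTEIs' ]

// Same steps, but the second one fails.
const failing = {
  ...steps,
  getGlobalOrgUnits: () => {
    throw new Error('lookup failed');
  },
};
const failure = runInSequence(order, failing, {});
console.log(failure.completed); // [ 'getPatients' ]
```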
Note that adaptors specified in the workflow.json will be auto-installed when you execute the workflow. To execute the workflow run this command:
openfn workflow.json -o tmp/output.json
On execution, the CLI will first auto-install the adaptors and then run the workflow.
When working with the workflow.json file, it is important to handle sensitive information, such as credentials and initial input data, in a secure manner. To protect your sensitive data, please follow the guidelines outlined below:
Configuration Key: In the workflow.json file, specify a path to a git-ignored configuration file that contains the credentials used to access the destination system. For example:
{
  ...
  "configuration": "tmp/openMRS-credentials.json"
},
Data Key: If you need to pass initial data to your job, specify a path to a git-ignored data file:
{
  ...
  "state": {
    "data": "tmp/initial-data.json"
  }
}