chain functions broken due to oa prefix in new batch-machine #18

@andrewharvey

https://github.com/openaddresses/openaddresses/blob/master/ATTRIBUTE_FUNCTIONS.md#chain documents the use of chain functions with a _wip variable. The example given is:

"street": {
    "function": "chain",
    "variable": "street_wip",
    "functions": [
        {
            "function": "postfixed_street",
            "field": "Prop_Addr"
        },
        {
            "function": "remove_postfix",
            "field": "street_wip",
            "field_to_remove": "Prop_Addr_Unit"
        }
    ]
}

This used to work fine with https://github.com/openaddresses/machine, but once that was deprecated in favour of https://github.com/openaddresses/batch-machine, a breaking change was introduced that makes this example fail, as evidenced by the errors in openaddresses/openaddresses#5496.

After a few months of not understanding the issue, I took another look, comparing the old conform https://github.com/openaddresses/machine/blob/05db17d8492b3d8f4064f0f5b0ca9c68041c535a/openaddr/conform.py with the new one https://github.com/openaddresses/batch-machine/blob/8bb6ecfb20beec56913ae7ae267cf5489c9c35ad/openaddr/conform.py. This revealed an issue in row_fxn_chain:

def row_fxn_chain(sc, row, key, fxn):
    functions = fxn["functions"]
    var = fxn.get("variable")
    original_key = key
    if var and var.upper().lstrip('OA:') not in sc.SCHEMA and var not in row:
        row['oa:' + var] = u''
        key = var
    for func in functions:
        row = row_function(sc, row, key, func)
        if row.get('oa:' + key):
            row[key] = row['oa:' + key]
    row['oa:{}'.format(original_key.lower())] = row['oa:{}'.format(key)]
    return row

The code here now creates these intermediate variables with the oa: prefix. Once I added this prefix to the _wip variable fields at https://github.com/openaddresses/openaddresses/pull/5828/files#diff-8441ea2e82820b7decef018f2d606c8a543ad86411a2c49aedab254a0f72f848, it worked.
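
To make it easier to see which keys the function quoted above writes, here is a minimal trace. It is a sketch only: `sc` and `row_function` are hypothetical stand-ins (the stand-in simply copies the source field into the `oa:`-prefixed key), not the real batch-machine objects, and the schema entries are assumed for illustration. `row_fxn_chain` itself follows the quoted logic.

```python
from types import SimpleNamespace

# Hypothetical stand-in for the schema object; only SCHEMA is consulted here.
sc = SimpleNamespace(SCHEMA=["STREET", "NUMBER", "UNIT"])

def row_function(sc, row, key, fxn):
    # Hypothetical stand-in for the dispatcher in conform.py: it copies the
    # source field into the 'oa:'-prefixed output key and does nothing else.
    row["oa:" + key] = row.get(fxn["field"], "")
    return row

def row_fxn_chain(sc, row, key, fxn):
    # Same logic as the function quoted in this issue.
    functions = fxn["functions"]
    var = fxn.get("variable")
    original_key = key
    if var and var.upper().lstrip('OA:') not in sc.SCHEMA and var not in row:
        row['oa:' + var] = u''
        key = var
    for func in functions:
        row = row_function(sc, row, key, func)
        if row.get('oa:' + key):
            row[key] = row['oa:' + key]
    row['oa:{}'.format(original_key.lower())] = row['oa:{}'.format(key)]
    return row

fxn = {
    "function": "chain",
    "variable": "street_wip",
    "functions": [{"function": "copy", "field": "Prop_Addr"}],
}

filled = row_fxn_chain(sc, {"Prop_Addr": "123 MAIN ST"}, "street", fxn)
empty = row_fxn_chain(sc, {"Prop_Addr": ""}, "street", fxn)

print(sorted(filled))  # includes both 'oa:street_wip' and 'street_wip'
print(sorted(empty))   # includes 'oa:street_wip' but no bare 'street_wip'
```

In this trace the intermediate value always lands under `oa:street_wip`, while the bare `street_wip` key is only written when the intermediate value is non-empty. Whether that is what triggers the failures in openaddresses/openaddresses#5496 is not established by this sketch.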

So the question for you, @ingalls: is this either

a) a bug in batch-machine that we can fix, or
b) a change in how these conform files are processed and we should update the documentation and existing sources which broke
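
If it turns out to be (b), the existing sources could probably be updated mechanically. The following is a rough sketch, assuming the fix amounts to prefixing `field` references to the chain variable with `oa:` and leaving `variable` itself unchanged (that is my reading of the change in the pull request above, not something confirmed). `prefix_chain_fields` is a hypothetical helper, not part of batch-machine.

```python
import copy

def prefix_chain_fields(attr):
    """Return a copy of a chain attribute whose steps reference the
    chain variable as 'oa:<variable>' instead of '<variable>'."""
    out = copy.deepcopy(attr)  # leave the caller's dict untouched
    var = out.get("variable")
    if out.get("function") != "chain" or not var:
        return out
    for step in out.get("functions", []):
        # Assumption: only "field" references to the variable need the prefix.
        if step.get("field") == var:
            step["field"] = "oa:" + var
    return out

street = {
    "function": "chain",
    "variable": "street_wip",
    "functions": [
        {"function": "postfixed_street", "field": "Prop_Addr"},
        {
            "function": "remove_postfix",
            "field": "street_wip",
            "field_to_remove": "Prop_Addr_Unit",
        },
    ],
}

fixed = prefix_chain_fields(street)
print(fixed["functions"][1]["field"])
```

The helper copies its input so the original conform data can still be compared against the rewritten version before anything is committed.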

I'm happy to help out, but I need to first understand some of the background and motivation of this change.
