Joining lists with python3.6
Few days ago I was wondering what is the best way to perform a “join” with two different list, using Python3.6. The point is to try dont make a cartesian product and filter after that. The third solution seems the best way in performing but it is not the clearest way.
Here it is the three approaches for joining lists with Python:
1. Nested loops
def join(jobids, stages):
for jobid in jobids:
for stage in stages:
stageids = jobid[1]
if stage['stageId'] in stageids:
stage['jobid'] = jobid[0]
complexity -> O(n^2)
2. Nested loops and itertools
package
def join(jobids, stages):
for jobid, stage in itertools.product(jobids, stages):
if stage['stageId'] in jobid[1]:
stage['jobId'] = jobid[0]
complexity -> O(n^2)
3. Reverse index and toolz
package
def join(jobids, stages):
idx = {stage: job for job, stages in jobids for stage in stages}
return [toolz.dicttoolz.assoc(stage, 'jobId', idx[stage['stageId']]) for stage in stages]
complexity -> O(n)
Data
jobids = [(76, [75, 77, 88]), (75, [74]), (74, [73])]
stages = [{'status': 'COMPLETE', 'stageId': 75, ...},
{'status': 'COMPLETE', 'stageId': 74, ...}]