How to implement FULL OUTER JOIN in sqlalchemy on orm level.
Here my code:
q1 = (db.session.query( tb1.user_id.label('u_id'), func.count(tb1.id).label('tb1_c') ) .group_by(tb1.user_id) ) q2 = (db.session.query( tb2.user_id.label('u_id'), func.count(tb2.id).label('tb2_c') ) .group_by(tb2.user_id) )
above two queries and I want to apply FULL OUTER JOIN on them.
First of all,
sqlalchemy does not support
FULL JOIN out of the box, and for some good reasons. So any solution proposed will consist of two parts:
- a work-around for missing functionality
sqlalchemysyntax to build a query for that work-around
Now, for the reasons to avoid the
FULL JOIN, please read some old blog Better Alternatives to a FULL OUTER JOIN.
From this very blog I will take the idea of how to avoid
FULL JOIN by adding
0 values to the missing columns and aggregating (
UNION ALL intead. SA code might look something like below:
q1 = (session.query( tb1.user_id.label('u_id'), func.count(tb1.id).label('tb1_c'), literal(0).label('tb2_c'), # @NOTE: added 0 ).group_by(tb1.user_id)) q2 = (session.query( tb2.user_id.label('u_id'), literal(0).label('tb1_c'), # @NOTE: added 0 func.count(tb2.id).label('tb2_c') ).group_by(tb2.user_id)) qt = union_all(q1, q2).alias("united") qr = select([qt.c.u_id, func.sum(qt.c.tb1_c), func.sum(qt.c.tb2_c)]).group_by(qt.c.u_id)
Having composed the query above, I actually might consider other options:
- simply execute those two queries separately and aggregate the results already in Python itself (for not so large results sets)
- given that it looks like some kind of reporting functionality rather than business model workflow, create a
SQLquery and execute it directly via
engine. (only if it really is much better performing though)