Question 16: Nagarro DE interview questions part2 | data engineer |
In this video I have discussed on Interview question asked in Nagarro interview for data engineers.
Find the medications that were prescribed by at least three different doctors.
Relevant DF:
df1 = medications (medication_id, medication_name),
df2 = prescriptions (prescription_id, doctor_id, medication_id)
medications_data = [
(1, "Medication A"),
(2, "Medication B"),
(3, "Medication C"),
(4, "Medication D"),
(5, "Medication E")
]
prescriptions_data = [
(1, 1, 1),
(2, 2, 1),
(3, 3, 1),
(4, 1, 2),
(5, 2, 2),
(6, 3, 2),
(7, 1, 3),
(8, 2, 4),
(9, 3, 4),
(10, 4, 5),
(11, 5, 5),
(12, 6, 5)
]
medications_df = spark.createDataFrame(medications_data, ["medication_id", "medication_name"])
prescriptions_df = spark.createDataFrame(prescriptions_data, ["prescription_id", "doctor_id", "medication_id"])
Part 1: • Question 15: Nagarro D...
Part 3: • Question 17: Nagarro D...
Check out this video and do let me know your doubts we can connect on
linkedIn : / priyam-jain-0946ab199
Do subscribe @pysparkpulse for more such Questions.
#pyspark #spark #bigdata #bigdataengineer #dataengineering #dataengineer #nagarro #mnc #interviewquestions #dataengineer #bigdatainterview
Пікірлер: 9
Thank you Sir
@pysparkpulse
3 ай бұрын
Most welcome
Very helpful, thanks for sharing :)
@pysparkpulse
5 ай бұрын
Thank You. Krishna
%sql with result as ( select medication_id, count(distinct(doctor_id)) as count_distinct_doc from prescriptions group by medication_id having count_distinct_doc >=3) select m.medication_name from result r join medications m on r.medication_id = m.medication_id
@pysparkpulse
4 ай бұрын
Yes correct 💯😀
select temp.medication_name, count(temp.doctor_id) dc from (select md.medication_id, medication_name, doctor_id from medications_data md inner join prescriptions_data pd on md.medication_id=pd.medication_id) temp group by temp.medication_name having dc=3 PostgreSQL Solution
@pysparkpulse
5 ай бұрын
Great 😃