HES APC - Procedures
The hes_apc_procedures asset is curated from the latest archived version of the HES APC procedures table (hes_apc_otr_all_years_archive). The output is a long-format table in which each row represents a single three-digit or four-digit OPCS procedure code (taken from opertn_01, …, opertn_nn) associated with a specific individual and hospital episode. Procedure codes are cleaned by removing non-alphanumeric characters, and rows where the code is null or an empty string are then removed, so that only valid OPCS codes are retained. Both the three-digit procedure codes and their corresponding four-digit variants (where available) are included, so the table can be queried consistently at either level of granularity. The resulting table has 10 columns: 7 identifier columns (person ID, episode key, episode start date, episode end date, procedure date, admission date and discharge date) and 3 columns describing the procedure code and its position:
- code: the OPCS procedure code
- code_digits: indicates whether the procedure code is the three- or four-digit version
- position: indicates the position of the procedure within the episode (e.g. 1–n, corresponding to opertn_01, opertn_02, …)
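The cleaning and reshaping described above can be sketched in plain Python. This is only an illustration under stated assumptions, not the curation pipeline itself: the function name `explode_procedures`, the field names `person_id` and `epikey`, the integer values used for `code_digits`, and the derivation of the three-digit code by truncating the cleaned code are all hypothetical. It also assumes every cleaned code has at least three characters.

```python
import re

def explode_procedures(episode):
    """Turn one wide episode record (a dict) into long-format procedure rows."""
    rows = []
    # Pick out the opertn_NN fields and order them by their numeric suffix.
    keys = sorted(
        (k for k in episode if re.fullmatch(r"opertn_\d+", k)),
        key=lambda k: int(k.split("_")[1]),
    )
    for key in keys:
        raw = episode[key]
        if raw is None:
            continue  # drop null codes
        code = re.sub(r"[^A-Za-z0-9]", "", raw)  # strip non-alphanumeric characters
        if code == "":
            continue  # drop codes that are empty after cleaning
        base = {
            "person_id": episode["person_id"],  # hypothetical identifier name
            "epikey": episode["epikey"],        # hypothetical identifier name
            "position": int(key.split("_")[1]),
        }
        # Assumption: the three-digit code is the first three characters.
        rows.append({**base, "code": code[:3], "code_digits": 3})
        # The four-digit variant is emitted only where it is available.
        if len(code) >= 4:
            rows.append({**base, "code": code[:4], "code_digits": 4})
    return rows

episode = {
    "person_id": "P1",
    "epikey": "E1",
    "opertn_01": "W37.1",
    "opertn_02": "-",
    "opertn_03": None,
    "opertn_04": "K40",
}
rows = explode_procedures(episode)
```

In this toy record, opertn_01 yields both a three-digit and a four-digit row, opertn_02 and opertn_03 are dropped, and opertn_04 yields a three-digit row only, with position 4 preserved.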
The table is saved to the DSA schema dsa_391419_j3w9t_collab. The archived_on_date suffix in the table name is in the format YYYY_MM_DD.
Table Name
hds_curated_assets__hes_apc_procedure_archived_on_date
The code below loads the hes_apc_procedure table as at October 2024 using PySpark:
```python
import pyspark.sql.functions as f
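# DSA schema that holds the curated assets
dsa = 'dsa_391419_j3w9t_collab'
# Load the version archived on 2024_10_01 (suffix format YYYY_MM_DD)
hes_apc_procedure = spark.table(f'{dsa}.hds_curated_assets__hes_apc_procedure_2024_10_01')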
```
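To avoid hand-typing the YYYY_MM_DD suffix, the table name could be built from a date object. The helper below is a minimal sketch: `procedure_table_name` is a hypothetical name, and it does not check that an archive actually exists for the date supplied.

```python
from datetime import date

DSA = "dsa_391419_j3w9t_collab"

def procedure_table_name(archived_on: date, dsa: str = DSA) -> str:
    """Build the fully qualified table name for a given archive date."""
    # The archive suffix uses underscores rather than hyphens: YYYY_MM_DD.
    suffix = archived_on.strftime("%Y_%m_%d")
    return f"{dsa}.hds_curated_assets__hes_apc_procedure_{suffix}"

table_name = procedure_table_name(date(2024, 10, 1))
# With an active Spark session, the table could then be loaded with:
# hes_apc_procedure = spark.table(table_name)
```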