HES APC - Procedures

View code on GitHub

The hes_apc_procedures asset is curated from the latest archived version of the HES APC procedures table (hes_apc_otr_all_years_archive). The output is a long-format table where each row represents an individual three-digit or four-digit OPCS procedure code (opertn_01, …, opertn_nn) associated with a specific individual and hospital episode. Procedure codes are cleaned by removing non-alphanumeric characters and rows where codes are null, or an empty string, are removed ensuring only valid OPCS codes are retained. Both three-digit procedure codes and their corresponding four-digit variants (where available) are represented, ensuring consistency across code granularities. The resulting table includes 10 columns: 6 identifier columns (person ID, episode key, episode start date, episode end date, procedure date, admission date and discharge date) and 3 columns describing the procedure code and position:

  • code: the OPCS procedure code
  • code_digits: indicates whether the procedure code is the three- or four-digit version
  • position: indicates the position of the procedure within the episode (eg., 1–n, corresponding to opertn_01, opertn_02, …)

Example

epikey person_id position procedure_date code code_digits epistart epiend admidate disdate
178954263574 A 1 2009-03-11 R18 3 2009-03-11 2009-03-13 2009-03-11 2009-03-13
178954263574 A 1 2009-03-11 R182 4 2009-03-11 2009-03-13 2009-03-11 2009-03-13
178954263574 A 2 2009-03-11 Y81 3 2009-03-11 2009-03-13 2009-03-11 2009-03-13
178954263574 A 2 2009-03-11 Y811 4 2009-03-11 2009-03-13 2009-03-11 2009-03-13
178954263574 A 3 2009-03-13 Y84 3 2009-03-13 2009-03-15 2009-03-13 2009-03-15
178954263574 A 3 2009-03-13 Y841 4 2009-03-13 2009-03-15 2009-03-13 2009-03-15
559478246553 B 1 09/12/2000 X29 3 2000-12-09 2000-12-09 2000-12-09 2000-12-09
559478246553 B 1 09/12/2000 X298 4 2000-12-09 2000-12-09 2000-12-09 2000-12-09

The table is saved to the DSA schema dsa_391419_j3w9t_collab. The archived_on_date is in the format YYYY_MM_DD.

Table Name

hds_curated_assets__hes_apc_procedure_archived_on_date

The below code will load the hes_apc_diagnosis table as at October 2024 using PySpark:

```python
import pyspark.sql.functions as f
dsa = f'dsa_391419_j3w9t_collab'
hes_apc_procedure = spark.table(f'{dsa}.hds_curated_assets__hes_apc_procedure_2024_10_01')
```