How to parse OBX segments from an ORU R01 HL7 message using Java and HAPI

Tagged hapi, hl7, java, obx, oru  Languages java, xml

Tested with HAPI 2.3:

import ca.uhn.hl7v2.model.v23.message.ORU_R01;
import ca.uhn.hl7v2.model.v23.group.*;
import ca.uhn.hl7v2.model.v23.segment.MSH;
import ca.uhn.hl7v2.model.v23.segment.OBX;
import ca.uhn.hl7v2.model.v23.segment.OBR;
import ca.uhn.hl7v2.model.Varies;
import ca.uhn.hl7v2.HL7Exception;
import ca.uhn.hl7v2.model.Message;
import ca.uhn.hl7v2.parser.PipeParser;

/*
 * An ORU R01 message has the following structure:
 *
 * MSH - Message Header Segment
 * RESPONSE (repeating)
 *   PATIENT
 *     PID - Patient Identification Segment
 *     PV1 - Patient Visit Segment
 *   ORDER OBSERVATION (repeating)
 *     ORC - Common Order Segment (optional)
 *     OBR - Observation Request Segment
 *     OBSERVATION (repeating)
 *       OBX - Observation Segment
 */

PipeParser pipeParser = new PipeParser();
pipeParser.setValidationContext(new ca.uhn.hl7v2.validation.impl.NoValidation());
// m is the raw pipe-delimited HL7 message as a String
Message message = pipeParser.parse(m);
ORU_R01 oru = (ORU_R01) message;
MSH msh = oru.getMSH();
String sendingApp = msh.getSendingApplication().encode();
String sendingFacility = msh.getSendingFacility().encode();
//
// ORU_R01_RESPONSE group structure (a Group object)
//
// 1: ORU_R01_PATIENT (a Group object) optional
// 2: ORU_R01_ORDER_OBSERVATION (a Group object) repeating
//
// See https://hapifhir.github.io/hapi-hl7v2/v23/apidocs/src-html/ca/uhn/hl7v2/model/v23/group/ORU_R01_RESPONSE.html
//
for (ORU_R01_RESPONSE response : oru.getRESPONSEAll()) {
  //
  // ORU_R01_ORDER_OBSERVATION group structure (a Group object)
  //
  // 1: ORC (Common order segment) optional
  // 2: OBR (Observation request segment)
  // 3: NTE (Notes and comments segment) optional repeating
  // 4: ORU_R01_OBSERVATION (a Group object) repeating
  // 5: CTI (Clinical Trial Identification) optional repeating
  //
  // See https://hapifhir.github.io/hapi-hl7v2/v23/apidocs/src-html/ca/uhn/hl7v2/model/v23/group/ORU_R01_ORDER_OBSERVATION.html
  //
  for (ORU_R01_ORDER_OBSERVATION orderObservation : response.getORDER_OBSERVATIONAll()) {
    OBR obr = orderObservation.getOBR();
    String fillerOrderNumber = obr.getObr3_FillerOrderNumber().encode();
    //
    // ORU_R01_OBSERVATION group structure (a Group object)
    //
    // 1: OBX (Observation segment) optional
    // 2: NTE (Notes and comments segment) optional repeating
    //
    // See https://hapifhir.github.io/hapi-hl7v2/v23/apidocs/src-html/ca/uhn/hl7v2/model/v23/group/ORU_R01_OBSERVATION.html
    //
    for (ORU_R01_OBSERVATION observation : orderObservation.getOBSERVATIONAll()) {
      //
      // HL7 OBX message segment (Observation segment)
      //
      // https://hapifhir.github.io/hapi-hl7v2/v23/apidocs/src-html/ca/uhn/hl7v2/model/v23/segment/OBX.html
      //
      OBX obx = observation.getOBX();
      String type = obx.getObx3_ObservationIdentifier().getCe2_Text().getValue();
      String status = obx.getObservResultStatus().getValue();
      // OBX-5 can repeat and may contain varying data types, hence Varies
      for (Varies varies : obx.getObx5_ObservationValue()) {
        String value = varies.encode();
        // log is assumed to be an SLF4J logger
        log.info("value {} type {} status {}", value, type, status);
      }
    }
  }
}
Maven dependencies (pom.xml):

    <dependency>
      <groupId>ca.uhn.hapi</groupId>
      <artifactId>hapi-base</artifactId>
      <version>${hapi.version.stable}</version>
    </dependency>
    <dependency>
      <groupId>ca.uhn.hapi</groupId>
      <artifactId>hapi-structures-v23</artifactId>
      <version>${hapi.version.stable}</version>
    </dependency>

Use the HAPI test panel to inspect messages and to find the names of elements, e.g. PID-3-4: https://hapifhir.github.io/hapi-hl7v2/hapi-testpanel/install.html
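
For reference, here is a minimal ORU^R01 message that the code above can parse (all identifiers and values are made up):

MSH|^~\&|LAB|FAC|APP|FAC|20170101120000||ORU^R01|MSG00001|P|2.3
PID|1||123456||Doe^John
OBR|1|PLACER123|FILLER123|88304^Biopsy^L
OBX|1|ST|GLU^Glucose^L||5.5|mmol/l|||||F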

How to use LATERAL as a for loop in Postgres

Tagged lateral, loop, postgres  Languages sql

This example shows how to use LATERAL as a for loop in Postgres. The query performs a lateral subquery for each month; the subquery returns the number of courses active during that month.

-- A list of months and their start and end timestamps
WITH months AS (
  SELECT start, start + (interval '1' month) - (interval '1' second) AS end FROM (
    SELECT generate_series(DATE('2010-01-01'), DATE('2010-12-01'), interval '1 month') AS start
  ) months
)
-- For each month, run a query that returns the number of active courses during that month
SELECT * FROM months AS m
LEFT JOIN LATERAL (
  SELECT count(id)
  FROM courses AS c
  WHERE
    -- The course started before the month ended, and either has not ended
    -- or ended after the month started
    c.start <= m.end AND (c.end >= m.start OR c.end IS NULL)
) AS results ON TRUE;
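
The courses table is assumed to look roughly like this (hypothetical schema):

CREATE TABLE courses (
  id serial PRIMARY KEY,
  start timestamptz NOT NULL,
  "end" timestamptz -- NULL while the course is still running
);

Note that in CREATE TABLE the end column must be double-quoted because END is a reserved word, while qualified references such as c.end are accepted.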

How to generate a series with the start and end dates of months with Postgres

Tagged generate_series, months, sql  Languages sql
WITH m AS (
  SELECT generate_series(DATE('2010-01-01'), DATE('2010-12-01'), interval '1 month') AS start
)
SELECT m.start, m.start + (interval '1' month) - (interval '1' second) AS end FROM m;
┌────────────────────────┬────────────────────────┐
│         start          │          end           │
├────────────────────────┼────────────────────────┤
│ 2010-01-01 00:00:00+02 │ 2010-01-31 23:59:59+02 │
│ 2010-02-01 00:00:00+02 │ 2010-02-28 23:59:59+02 │
│ 2010-03-01 00:00:00+02 │ 2010-03-31 23:59:59+03 │
│ 2010-04-01 00:00:00+03 │ 2010-04-30 23:59:59+03 │
│ 2010-05-01 00:00:00+03 │ 2010-05-31 23:59:59+03 │
│ 2010-06-01 00:00:00+03 │ 2010-06-30 23:59:59+03 │
│ 2010-07-01 00:00:00+03 │ 2010-07-31 23:59:59+03 │
│ 2010-08-01 00:00:00+03 │ 2010-08-31 23:59:59+03 │
│ 2010-09-01 00:00:00+03 │ 2010-09-30 23:59:59+03 │
│ 2010-10-01 00:00:00+03 │ 2010-10-31 23:59:59+02 │
│ 2010-11-01 00:00:00+02 │ 2010-11-30 23:59:59+02 │
│ 2010-12-01 00:00:00+02 │ 2010-12-31 23:59:59+02 │
└────────────────────────┴────────────────────────┘

Same query without the CTE:

SELECT start, start + (interval '1' month) - (interval '1' second) AS end FROM (
  SELECT generate_series(DATE('2010-01-01'), DATE('2010-12-01'), interval '1 month') AS start
) months;

How to crack an MD5 password using hashcat

Tagged crack, hashcat, md5  Languages bash

Brute-force attack

Install hashcat:

brew install hashcat

Write the MD5 hashes that we want hashcat to crack for us to a file:

echo '098f6bcd4621d373cade4e832627b4f6' >> hashes
echo '1a1dc91c907325c69271ddf0c944bc72' >> hashes

Attempt to crack the MD5 password hashes using brute force (the "-a 3" switch):

$ hashcat -a 3 hashes

Show cracked hashes and passwords:

$ hashcat -a 3 hashes --show
098f6bcd4621d373cade4e832627b4f6:test
1a1dc91c907325c69271ddf0c944bc72:pass

Dictionary attack

Download a dictionary:

wget https://raw.githubusercontent.com/berzerk0/Probable-Wordlists/master/Real-Passwords/Top196-probable.txt

Write the MD5 hashes that we want hashcat to crack for us to a file:

echo '5f4dcc3b5aa765d61d8327deb882cf99' >> hashes

Attempt to crack the password using the dictionary:

hashcat -a 0 hashes Top196-probable.txt

Show cracked hashes and passwords:

$ hashcat hashes --show
5f4dcc3b5aa765d61d8327deb882cf99:password

How to decode base64 encoded MD5 password hash

To decode a base64 encoded MD5 password hash into the hex format hashcat expects, use:

# Note: "base64 -D" is the macOS flag; on Linux use "base64 -d"
echo -n 'base64-encoded-md5-password' | base64 -D | xxd -g 0 -ps > hashes

Tested with hashcat v4.0.1.

Database diff

Tagged database, diff, postgres  Languages bash, ruby, yaml

This script will connect to two databases, named old and new, and print:

  • the names of the tables that cannot be found in the new database
  • the difference in row count per table, for tables where the counts differ

db_diff.rb:

#
# Postgres database diff script. Prints the names of missing tables and the
# difference in the number of rows per table.
#
require 'sequel'
require 'pg'
require 'json'
require 'yaml'
require 'pry'

list_tables_sql = "SELECT tablename from pg_tables where schemaname = 'public';"
OLD_DB = Sequel.connect(YAML.load(File.read('old_database.yml')))
NEW_DB = Sequel.connect(YAML.load(File.read('new_database.yml')))
OLD_TABLES = OLD_DB[list_tables_sql].map{ |x| x.fetch(:tablename) }
NEW_TABLES = NEW_DB[list_tables_sql].map{ |x| x.fetch(:tablename) }

# Compare tables
def diff_tables
  OLD_TABLES - NEW_TABLES
end

# Compare row count
def diff_row_count
  OLD_TABLES.sort.reduce({}) do |hash, table|
    sql = "SELECT count(*) FROM %{table}"
    # Sequel's count method does not work here, so run a raw count query.
    diff = OLD_DB[sql % {table: table}].first[:count].to_i - NEW_DB[sql % {table: table}].first[:count].to_i
    hash[table] = diff if diff != 0
    hash
  end
end

puts JSON.pretty_generate(tables: diff_tables, rows: diff_row_count)

Gemfile:

source 'https://rubygems.org'

gem 'sequel'
gem 'pg'
gem 'pry'

new_database.yml and old_database.yml:

adapter: postgres
host: localhost
port: 5432
encoding: unicode
database: x_production
username: x
password: x

To run the script:

gem install bundler
bundle
bundle exec ruby db_diff.rb

Detecting software version with Ansible

Tagged ansible, version  Languages yaml

With Ansible, detecting the version of, for example, Redis or Racket can be done like this:

- name: Detect Redis version
  # Input: Redis server v=3.2.1 sha=00000000:0 malloc=libc bits=64 build=62a67eec83b28403
  # Output: 3.2.1
  shell: redis-server -v | awk '{print $3}' | sed -e 's/v=//'
  changed_when: False
  register: redis_installed_version

- name: Detect Racket version
  # Input: Welcome to Racket v6.6.
  # Output: 6.6
  shell: "racket -v | rev | cut -d ' ' -f1 | rev | sed 's/.$//' | sed 's/^v//'"
  changed_when: False
  register: racket_installed_version

Example: download the source only if the installed version does not match:

- get_url:
    url: http://download.redis.io/releases/redis-{{redis_version}}.tar.gz
    dest: /usr/local/src/
    sha256sum: "{{redis_sha256}}"
  register: get_redis_result
  when: redis_installed_version.stdout | version_compare(redis_version, '!=')

How to set the setgid flag with Ansible

Tagged ansible, setgid, setuid  Languages yaml

This is how to set the setgid flag with Ansible. Tested with Ansible version 2.4.

- name: Create directories and set the setgid flag
  file:
    path: "{{item}}"
    state: directory
    owner: www-data
    group: www-data
    # NOTE: 2 = the setgid bit. The leading 0 is needed so Ansible treats the mode as octal
    mode: 02770
    recurse: true
  with_items:
    - /var/www/
  become: true

Reference

Setting the setgid permission on a directory (“chmod g+s”) causes new files and subdirectories created within it to inherit its group ID, rather than the primary group ID of the user who created the file (the owner ID is never affected, only the group ID).

See https://en.wikipedia.org/wiki/Setuid
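
Outside of Ansible, the effect can be demonstrated with plain shell commands (the directory path is an example):

# Create a directory owned by the www-data group and set the setgid bit
sudo mkdir -p /var/www/uploads
sudo chown www-data:www-data /var/www/uploads
sudo chmod 2770 /var/www/uploads # drwxrws--- (the "s" is the setgid bit)
# Files created in the directory now inherit the www-data group,
# regardless of the creating user's primary group
sudo touch /var/www/uploads/example.txt
ls -l /var/www/uploads/example.txt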

Elasticsearch: How to delete log entries older than X days with Curator

Tagged curator, elasticsearch  Languages bash, cron
# Install curator
pip install elasticsearch-curator
# Download curator config file
curl -o curator.yml https://raw.githubusercontent.com/elastic/curator/master/examples/curator.yml

Next, download, read, and edit the action file: https://www.elastic.co/guide/en/elasticsearch/client/curator/current/actionfile.html
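
As a sketch, an action file (remove-old-data.yml in the crontab example below) that deletes indices older than 30 days could look like this; the index prefix, timestring, and day count are assumptions to adapt:

actions:
  1:
    action: delete_indices
    description: Delete indices older than 30 days (based on the index name)
    options:
      ignore_empty_list: True
    filters:
      - filtertype: pattern
        kind: prefix
        value: logstash-
      - filtertype: age
        source: name
        direction: older
        timestring: '%Y.%m.%d'
        unit: days
        unit_count: 30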

# Run curator
curator --config curator.yml action_file.yml

Add this to crontab:

# Run curator at 00:01
01 00 * * * /usr/local/bin/curator --config /etc/elasticsearch/curator/curator.yml /etc/elasticsearch/curator/remove-old-data.yml >> /var/log/elasticsearch-curator.log

Tested with Elasticsearch 6.0 and curator version 5.4.1.

GDPR

Tagged encryption, eu, gdpr, pseudonymization, masking  Languages 
  • In order to be able to demonstrate compliance with the GDPR, the data controller should implement measures which meet the principles of data protection by design and data protection by default.

  • Privacy by Design and by Default (Article 25) require that data protection measures are designed into the development of business processes for products and services. Such measures include pseudonymising personal data, by the controller, as soon as possible (Recital 78).

  • Although the GDPR encourages the use of pseudonymisation to “reduce risks to the data subjects,” (Recital 28) pseudonymised data is still considered personal data (Recital 26) and therefore remains covered by the GDPR.

  • However, the notice to data subjects is not required if the data controller has implemented appropriate technical and organizational protection measures that render the personal data unintelligible to any person who is not authorized to access it, such as encryption (Article 34).

  • Records of processing activities must be maintained, including the purposes of the processing, the categories involved, and the envisaged time limits. These records must be made available to the supervisory authority on request (Article 30).

https://en.wikipedia.org/wiki/General_Data_Protection_Regulation

  • Pseudonymization is a central feature of “data protection by design.”

  • Companies that encrypt their personal data also gain the advantage of not having to notify data subjects in the case of a breach. (They still, though, would have to notify the local DPA.)

  • Under Article 32, controllers are required to implement risk-based measures for protecting data security. One such measure is the “pseudonymization and encryption of personal data”

  • The GDPR addresses the first concern in Recital 75, which instructs controllers to implement appropriate safeguards to prevent the “unauthorized reversal of pseudonymization.” To mitigate the risk, controllers should have in place appropriate technical (e.g., encryption, hashing or tokenization) and organizational (e.g., agreements, policies, privacy by design) measures separating pseudonymous data from an identification key.

https://iapp.org/news/a/top-10-operational-impacts-of-the-gdpr-part-1-data-security-and-breach-notification/

  • In this regard, the GDPR expressly says that businesses should consider implementing “as appropriate … the pseudonymisation and encryption of personal data.” While the law stops short of telling businesses they must implement pseudonymisation, the express reference to pseudonymisation in the security provisions of the GDPR is highly significant – indicating that, in the event of a security breach, regulators will take into consideration whether or not a business had implemented pseudonymisation technologies. Businesses that have not may therefore find themselves more exposed to regulatory action.

  • If a data breach presents low risk to the individuals concerned, the GDPR’s breach notification requirements become more relaxed. Pseudonymisation, whether through masking, hashing or encryption, offers a clear means to reduce the risks to individuals arising from a data breach (e.g. by reducing the likelihood of identity fraud and other forms of data misuse), and is supported by the GDPR as a security measure as already described above.

http://www.directcommercemagazine.com/UserContent/doc/12624/delphix%20gdpr%20for%20data%20masking.pdf

Tools

  • Data masking

https://www.mssqltips.com/sqlservertip/3091/masking-personal-identifiable-sql-server-data/

  • Data encryption

PostgreSQL 10 With Streaming Replication and PITR

Tagged pitr, postgresql, replication, standby, streaming  Languages bash, sql

Notes on how to configure streaming replication and PITR with PostgreSQL 10.

Goal

  • Master and slave (warm standby)

    • Master database backed up to slave server via streaming replication
    • WAL files copied from master to a network drive
  • Point-in-time-recovery (PITR), e.g., if someone deletes a table by mistake
  • Recovery possible even if both master and slave are lost
  • Daily and weekly backups available on a network drive
  • WAL files available on a network drive
  • Network drive backed up

Steps

  1. Install postgres 10 on master and slave
  2. Configure the master

    • wal_level=replica, archive_mode=on, archive_command, wal_keep_segments (optional with replication slots), etc. in postgresql.conf (see the sketch below)
    • archive_command should copy WAL files to a shared network drive for additional redundancy, or to the slave
    • create a replication slot for each slave, so that WAL files are kept long enough for the slaves to receive them:
SELECT * FROM pg_create_physical_replication_slot('pg-slave-1');
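
A sketch of the master settings described above (the archive path and values are examples):

# postgresql.conf on the master
wal_level = replica
archive_mode = on
# Copy each completed WAL file to a shared network drive (example path)
archive_command = 'cp %p /mnt/network-drive/wal/%f'
# Optional when replication slots are used
wal_keep_segments = 64
max_wal_senders = 5
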
  3. Configure the slave

    • hot_standby=on, etc. in postgresql.conf (keep the slave configuration as identical as possible to the master)
    • primary_slot_name='pg-slave-1', standby_mode=on, restore_command, primary_conninfo, trigger_file in recovery.conf (see the sketch below)
    • restore_command should use the WAL files on the network drive that are copied there from the master
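
A sketch of the slave's recovery.conf (the connection string and paths are examples):

# recovery.conf on the slave
standby_mode = on
primary_conninfo = 'host=10.0.0.1 port=5432 user=replicator password=secret'
primary_slot_name = 'pg-slave-1'
# Restore archived WAL files from the network drive (example path)
restore_command = 'cp /mnt/network-drive/wal/%f %p'
# Touching this file promotes the slave to master
trigger_file = '/tmp/promote-slave-to-master.trigger'
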
  4. Start primary server, stop the slave

Make sure that a process supervisor such as monit does not immediately start the slave again, for example:
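
# If monit manages PostgreSQL, stop monitoring the service first
# (the service name "postgresql" is an example)
sudo monit unmonitor postgresql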

  5. Copy the master database to the slave with pg_basebackup

Make sure the slave’s data directory is empty:

psql -c '\db'
sudo /etc/init.d/postgresql stop
sudo rm -rf /var/lib/postgresql/10/main
sudo mkdir /var/lib/postgresql/10/main
sudo pg_basebackup -h 10.0.0.1 -D /var/lib/postgresql/10/main -P -U replicator -X stream -W
sudo chown -R postgres:postgres /var/lib/postgresql/10/main
sudo chmod -R 0700 /var/lib/postgresql/10/main
  6. Start slave
sudo service postgresql start
  7. Set up daily backups

Configure daily backups of PostgreSQL data to a network drive.

  8. Back up the backup

Configure daily and weekly backups of network drive.

  9. Check replication status

On master:

select pg_current_wal_lsn();

On slave:

select pg_last_wal_replay_lsn();

Both values should match.

Scenarios

  • Master server is killed

Promote slave to master with touch /tmp/promote-slave-to-master.trigger

  • Master server is killed, slave replication has fallen behind master

Restore slave from the WAL files located on the network drive. Or use a daily backup plus the WAL files if replication has fallen behind too much.

  • Master server and slave are killed

Restore the database from a daily backup and WAL files located on the network drive.

  • Master server, slave, and network drive are killed

Restore the database from a daily backup and the WAL files located on another network drive.

  • “drop database xxx” was run by mistake

Restore the database with PITR. For example, set recovery_target_time = '2017-06-06 06:06:06' in recovery.conf. A sketch (the WAL archive path is an example):
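
# recovery.conf for point-in-time recovery
restore_command = 'cp /mnt/network-drive/wal/%f %p'
# Stop replaying WAL just before the mistake happened
recovery_target_time = '2017-06-06 06:06:06'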

  • Additional slaves are needed

Configure the new slave. Remember to create a new replication slot.

  • Slave is removed permanently

Delete the replication slot or the WAL archives will keep accumulating until the master’s disk is full.

Troubleshooting

  • Replication not working

Is recovery.conf in the data directory? Was it there before the server was started?

References