The hierarchy employed in this paper is based on the Clinical Classifications Software (CCS) \cite{healthcare2010clinical} for the ICD-9-CM taxonomy of diagnoses. The CCS for ICD-9-CM is one in a family of databases and software tools developed by the Healthcare Cost and Utilization Project \cite{hcupnet2003utilization}, which is based on a Federal-State-Industry partnership and sponsored by the Agency for Healthcare Research and Quality. It was created to inform decision-making at the national, State, and community levels, and it serves as a tool for clustering patient diagnoses and procedures into a manageable number of clinically meaningful categories. This paper uses the multi-level CCS (a hierarchical system), which groups single-level CCS categories into broader body systems or condition categories (e.g., “Diseases of the Circulatory System”, “Mental Disorders”, and “Injury”). The multi-level system has four levels for diagnoses and three levels for procedures, which makes it possible to examine general groupings or to assess very specific conditions and procedures. Both diagnoses and procedures are treated as concepts, and each concept is connected to another by an ‘is-a’ relationship. The CCS for ICD-9-CM hierarchy can therefore be traversed from top-level concepts down to bottom-level concepts.
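The traversal described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the concept identifiers (`C`, `C.1`, `C.1.1`, and so on) and the helper names are made up for the example and are not taken from the CCS tables; only the three root names come from the text.

```python
from collections import defaultdict

# Illustrative 'is-a' edges (child -> parent). The identifiers are hypothetical;
# they are not real CCS codes.
IS_A = {
    "C.1": "C", "C.2": "C", "C.1.1": "C.1", "C.1.2": "C.1",
    "M.1": "M",
}
# Top-level concepts; only these names are taken from the text.
ROOTS = {
    "C": "Diseases of the Circulatory System",
    "M": "Mental Disorders",
    "I": "Injury",
}


def build_children(is_a):
    """Invert child -> parent edges into a parent -> sorted children map."""
    children = defaultdict(list)
    for child, parent in is_a.items():
        children[parent].append(child)
    return {parent: sorted(kids) for parent, kids in children.items()}


def leaves_under(node, children):
    """Traverse from a concept down to its bottom-level concepts."""
    kids = children.get(node, [])
    if not kids:
        return [node]
    found = []
    for kid in kids:
        found.extend(leaves_under(kid, children))
    return found


def ancestors(node, is_a):
    """Follow 'is-a' links upward until a top-level concept is reached."""
    path = []
    while node in is_a:
        node = is_a[node]
        path.append(node)
    return path


CHILDREN = build_children(IS_A)
# Every top-level concept spans its own tree; the trees share no nodes.
LEAVES_PER_ROOT = {root: leaves_under(root, CHILDREN) for root in ROOTS}
```

For instance, `ancestors("C.1.1", IS_A)` walks the ‘is-a’ chain upward and returns `["C.1", "C"]`, while `LEAVES_PER_ROOT["C"]` lists the bottom-level concepts reachable from the circulatory root.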

Notably, the CCS for ICD-9-CM resembles a forest, that is, a collection of trees that are not connected to one another. A fragment of this hierarchy is presented in Fig. 1, which shows a generic concept name at the top and the corresponding CCS for ICD-9-CM code range associated with each concept, down to the most specific concepts at the bottom level.

As an alternative to the domain CCS hierarchy, we also derive a hierarchy from the sets of readmission diagnoses and use it in the learning and prediction phases in order to improve predictive performance. When building a hierarchy over the label space, there is only one constraint to respect: the original multi-label classification task must be defined by the leaves of the label hierarchy. In particular, the labels of the original multi-label classification problem are the leaves of the tree hierarchy, while the internal nodes are so-called meta-labels, which model the correlation among the original labels. Label hierarchies developed in this data-driven manner have been used for multi-label classification in \cite{madjarov2014evaluation}. That work considers flat label sets and constructs label hierarchies from the label sets that appear in the annotations of the training data, using balanced k-means clustering \cite{tsoumakas2008effective}, agglomerative clustering with single and complete linkage \cite{manning2009information}, and clustering performed with predictive clustering trees (PCTs).
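To make the leaf constraint concrete, the following is a minimal sketch of a data-driven hierarchy built by recursive balanced k-means over label columns. It is a simplified illustration, not the implementation used in the cited works: the toy label matrix `Y`, the label names `d1` to `d4`, the meta-label naming scheme (`root.0`, `root.1`), and the deterministic choice of initial centres (a stand-in for random initialisation) are all assumptions made for the example.

```python
import math


def sq_dist(a, b):
    """Squared Euclidean distance (equals Hamming distance on binary vectors)."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def mean(vectors):
    """Component-wise mean of equally long vectors."""
    return [sum(col) / len(vectors) for col in zip(*vectors)]


def balanced_kmeans(labels, vec, k, iters=10):
    """Split labels into k clusters holding at most ceil(n / k) labels each."""
    cap = math.ceil(len(labels) / k)
    # Assumption: evenly spaced initial centres instead of random ones.
    centers = [list(vec[labels[i * len(labels) // k]]) for i in range(k)]
    clusters = [[] for _ in range(k)]
    for _ in range(iters):
        clusters = [[] for _ in range(k)]  # entries are (distance, label)
        banned = {lab: set() for lab in labels}
        for lab in labels:
            current = lab
            while True:
                j = min((i for i in range(k) if i not in banned[current]),
                        key=lambda i: sq_dist(vec[current], centers[i]))
                clusters[j].append((sq_dist(vec[current], centers[j]), current))
                clusters[j].sort()
                if len(clusters[j]) <= cap:
                    break
                # Cluster is over capacity: evict its farthest member and
                # place that label in its next-closest cluster instead.
                _, current = clusters[j].pop()
                banned[current].add(j)
        new_centers = [mean([vec[l] for _, l in c]) if c else centers[i]
                       for i, c in enumerate(clusters)]
        if new_centers == centers:
            break
        centers = new_centers
    return [[lab for _, lab in c] for c in clusters if c]


def build_hierarchy(labels, vec, k=2, name="root"):
    """Return (meta_label, children); every leaf is an original label."""
    if len(labels) <= k:
        return (name, list(labels))
    children = []
    for i, part in enumerate(balanced_kmeans(labels, vec, k)):
        if len(part) == 1:
            children.append(part[0])
        else:
            children.append(build_hierarchy(part, vec, k, f"{name}.{i}"))
    return (name, children)


def leaves(node):
    """Collect the original labels found at the leaves of a hierarchy."""
    if isinstance(node, str):
        return [node]
    return [leaf for child in node[1] for leaf in leaves(child)]


# Toy data: each label maps to its column in a binary label matrix
# (one entry per training record). Entirely hypothetical.
Y = {
    "d1": [1, 1, 0, 0, 0, 0],
    "d2": [1, 1, 1, 0, 0, 0],
    "d3": [0, 0, 0, 1, 1, 0],
    "d4": [0, 0, 0, 1, 1, 1],
}
tree = build_hierarchy(sorted(Y), Y, k=2)
```

On this toy input the labels that co-occur are grouped under the same meta-label, and `leaves(tree)` returns exactly the original label set, which is the constraint stated above. A larger `k` yields a wider, shallower multi-branch hierarchy.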

The multi-branch hierarchy (defined by balanced k-means clustering) appears much more suitable for PCTs for hierarchical multi-label classification than the binary hierarchies defined by agglomerative clustering with single and complete linkage and by PCTs. In this work, to derive the hierarchy for the (original) multi-label classification problem, we employ the balanced k-means and the agglomerative clustering approaches.
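The difference between single and complete linkage, and the binary shape of the resulting hierarchies, can be illustrated with a small pure-Python sketch. It assumes hypothetical label columns (`d1` to `d4` over ten records) and represents each merge as a two-element tuple; it is a naive illustration of the technique, not the procedure used in the cited works.

```python
from itertools import combinations


def sq_dist(a, b):
    """Squared Euclidean distance (equals Hamming distance on binary vectors)."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def flat(node):
    """List the original labels contained in a (possibly merged) node."""
    if isinstance(node, str):
        return [node]
    return flat(node[0]) + flat(node[1])


def linkage_distance(c1, c2, vec, linkage):
    """Single linkage uses the closest pair, complete linkage the farthest."""
    dists = [sq_dist(vec[a], vec[b]) for a in c1 for b in c2]
    return min(dists) if linkage == "single" else max(dists)


def agglomerate(vec, linkage="single"):
    """Merge the two closest clusters until one binary tree remains."""
    nodes = sorted(vec)  # every label starts as its own cluster
    while len(nodes) > 1:
        i, j = min(
            combinations(range(len(nodes)), 2),
            key=lambda p: linkage_distance(
                flat(nodes[p[0]]), flat(nodes[p[1]]), vec, linkage),
        )
        merged = (nodes[i], nodes[j])  # an internal node, i.e. a meta-label
        nodes = [n for idx, n in enumerate(nodes) if idx not in (i, j)]
        nodes.append(merged)
    return nodes[0]


# Hypothetical label columns over ten records, chosen so that the two
# linkage criteria disagree on the second merge.
Y = {
    "d1": [1, 0, 0, 0, 0, 0, 0, 0, 0, 0],
    "d2": [1, 1, 1, 0, 0, 0, 0, 0, 0, 0],
    "d3": [1, 1, 1, 1, 1, 1, 0, 0, 0, 0],
    "d4": [1, 1, 1, 1, 1, 1, 1, 1, 1, 1],
}
single_tree = agglomerate(Y, "single")
complete_tree = agglomerate(Y, "complete")
```

With this input, single linkage chains the labels one after another, whereas complete linkage produces two balanced pairs; in both cases every internal node has exactly two children. In practice a library routine such as SciPy's `scipy.cluster.hierarchy.linkage` with `method="single"` or `method="complete"` would normally replace this quadratic loop.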