CMS PIPDCG Public Use Payment Model Software for 2003
Revised October 17, 2002

This memorandum provides documentation for using the CMS PIPDCG Public Use Payment Model Software for 2003. It was prepared by DxCG, Inc. under a subcontract to Health Economics Research (HER) on HCFA contract 500-99-0038.

The software consists of two programs:

PIPMODEL.TXT--program that processes enrollment and claims input data, and calculates relative risk scores and predicted expenditures.

FMTPROG.TXT--program that creates the required SAS format library.

Note: The software is SAS source code. The program takes the form needed to run on an IBM mainframe. Modifications are needed to run it on other platforms. For example, the JCL assigning files would be replaced by SAS statements or system-specific code.

PIPMODEL.TXT is the main program that assigns PIPDCGs and calculates relative risk scores and predicted expenditures. PIPMODEL.TXT is written in the SAS programming language. It was developed in SAS Version 6.09 and should run correctly on that version or any more recent version.

To run PIPMODEL.TXT successfully, a SAS FORMAT LIBRARY containing the crosswalk from ICD-9-CM diagnosis codes to PIPDCG is necessary. This SAS FORMAT LIBRARY is external to PIPMODEL.TXT. PIPMODEL.TXT utilizes the library to map ICD-9-CM diagnosis codes into PIPDXGs and to map PIPDXGs into PIPDCGs. The separate program FMTPROG.TXT creates the required SAS FORMAT LIBRARY containing the cross-walk.

The JCL statement with the DDname LIBRARY in PIPMODEL.TXT points to the SAS FORMAT LIBRARY. Before running PIPMODEL.TXT, the user must run the program FMTPROG.TXT, which references the SAS FORMAT LIBRARY as output. The details of this reference depend on the user's platform.

The remainder of this document explains PIPMODEL.TXT. The topics covered are:

I. Input Files
II. User-Defined Program Parameters
III. Notes on Program Computations
IV. Output File
Appendix A: HER age/sex diagnosis validity edits

I.
Input Files

Two SAS input files are required for the software:

PERSON file--a person-level file of demographic and enrollment information; and

ADMISSN file--an admission-level file of diagnoses and length of stay.

Data requirements for the SAS input files:

A. PERSON Input File

The person-level input file requires the following variables for each person:

1) IDNO
IDNO must be character or numeric type and unique to an individual. It can, for example, be the Medicare HICNO. The PERSON Data Set must be presorted in ascending order by IDNO.

2) SEX
One character: 1=male, 2=female.

3) DOB
Date of birth. Numeric variable in format YYYYMMDD, where Y indicates year, M is month, and D is day. For example, someone born on June 23, 1930, would be coded as 19300623.

Using DOB, SEX, YEAR2, and YR2MONTH (YEAR2 and YR2MONTH are user-defined program parameters specifying the year for which expenditures are being predicted - see Section II. below), the software computes a person's age and sex cells, which are defined as the fraction of eligible* months in YEAR2 that are spent in each cell. A person's age for each month in Year 2 is determined as their age as of the first day of the following month. Then the number of months in each age cell in Year 2 is determined, and divided by the number of eligible months to determine the fraction of the year in each age/sex cell. These cells may take on the values zero, fraction, or one, and sum to one for each person.

Example: A woman was born on June 15, 1933. Expenditures are being predicted for YEAR2 = 2003, YR2MONTH=1 (January). For 5 months of YEAR2, the woman is 69 years old, and for seven months she is 70 years old. She receives the values 5/12 = 0.42 for the cell female, age 65-69 and 7/12 = 0.58 for the cell female, age 70-74.

*Throughout this program, it is assumed that the number of eligible months in YEAR2 for each individual is 12.

4) OREC
Original reason for Medicare entitlement.
One character, which may take on the following values:

0 = old age and survivors insurance (OASI)
1 = disability insurance benefits (DIB)
2 = End Stage Renal Disease (ESRD)
3 = both DIB and ESRD

Using OREC, DOB, YEAR2, and YR2MONTH, the software computes EVERDISM (ever disabled), which is defined as the fraction of YEAR2 eligible months spent in ever disabled status. EVERDISM may take on the value zero, fraction, or one. Ever disabled status refers to someone who was originally entitled to Medicare by disability, but is currently entitled by age (i.e., is currently 65 years of age or older). Someone is in ever disabled status if they are currently 65 years old or older, and OREC takes on the values 1, 2, or 3.

5) MCAID
Numeric variable. Medicaid status, defined as 1 for year 2 (YEAR2) if the person was enrolled in Medicaid for at least one month in (the base) year 1, 0 otherwise. This variable takes on the values 0 or 1 for all of year 2; it cannot be a fraction.

6) MSP
Numeric variable. Medicare as a secondary payer, or working aged status, defined as the fraction of eligible months in year 2 that the person is in working aged status. This variable can take the values zero, fraction, or one. MSP should be coded as 0 for the non-working aged. It should never be left missing.

To derive a correct relative risk score and monthly predicted expenditure value for someone who is working aged in a given month, set this variable to 1. However, if this person has non-working-aged months during the prediction year, neither the annualized predicted expenditure nor the relative risk score the program outputs will be correct for the entire year. To get the correct results for someone with a mixture of working aged and non-working-aged months, enter the fraction of eligible months in working aged status in year 2.
For example, if a person is in working aged status for 6 of 12 months, enter 0.5. In this case, the monthly prediction will be an average prediction across months in the year, and will be incorrect for any particular month, because in a particular month someone is either in working aged status or not.

7) NEWENROL
Numeric variable. 1 indicates a new beneficiary (Medicare enrollee). This indicator is used when the beneficiary has not been eligible for a full data year for the collection of encounters. 0 indicates a continuing beneficiary. New and continuing beneficiaries have different payment formulas.

8) CHFYEAR3
Character variable, length 1. 1 indicates that the enrollee was discharged from an acute care hospital between 7/01/99 and 6/30/00 after a stay of more than one day for a principal inpatient diagnosis of CHF. Otherwise, the value should be 0. A set of adjusted output variables will be calculated that may include additional payment components for beneficiaries with a prior CHF diagnosis identified by CHFYEAR3 = 1.

9) CHFYEAR4
Character variable, length 1. 1 indicates that the enrollee was discharged from an acute care hospital between 7/01/00 and 6/30/01 after a stay of more than one day for a principal inpatient diagnosis of CHF. Otherwise, the value should be 0. A set of adjusted output variables will be calculated that may include additional payment components for beneficiaries with a prior CHF diagnosis identified by CHFYEAR4 = 1.

B. ADMISSN File

The hospital admission level input file requires the following variables, where each record represents one hospital admission:

1) IDNO
IDNO of person admitted. The specifications are the same as for the PERSON file. IDNO on the ADMISSN file must be identical to the IDNO for the same person on the PERSON file. It can, for example, be a person's Medicare HICNO. The ADMISSN file must be presorted in ascending order by IDNO.

2) LOS
Numeric variable.
Length of stay, in days, defined as discharge date minus admission date.

3) DIAG1, DIAG2, DIAG3, ..., DIAG[MAXDIAG]
(In the SAS computer language, DIAG1-DIAG&MAXDIAG)
All ICD-9-CM diagnoses from this hospital stay. The principal diagnosis for the stay must be in the first position (DIAG1). All secondary diagnoses must follow without gaps on the record. The order of the secondary diagnoses does not matter. If a record (admission) has fewer than MAXDIAG diagnoses, then the remaining diagnosis fields must be blank filled through MAXDIAG. (MAXDIAG is a user-defined program parameter specifying the maximum number of diagnoses allowed on a single admission record - see below.)

Each diagnosis is a 5 character field. Diagnosis codes should be left-justified, include leading zeros, exclude periods, and should be right filled with blanks. Letter codes (i.e., V codes) should be UPPER case. Diagnosis codes not conforming to these specifications will be considered invalid.

Examples ("b" indicates a blank space):

003.2   should be coded 0032b   NOT 003.2   NOT 0032   NOT bb32b   NOT 32bbb
003.20  should be coded 00320   NOT 003.2b
650     should be coded 650bb   NOT 65000   NOT 00650   NOT bb650
V57.0   should be coded V570b   NOT v570b
806.21  should be coded 80621   NOT 806.21

Only fully specified ICD-9-CM codes are considered valid by the software. For example, 4-digit stems (roots) of codes requiring 5 digits are not considered valid. Any ICD-9-CM code valid for fiscal year 2000, 2001 or 2002 will be processed by the software; all others will be considered invalid and will be ignored.

II. User-Defined Program Parameters

The user can set the following program parameters in Part 1, Step 1 of the SAS program. If they are not changed by the user, they assume the default values indicated.

1) LOS01 = 1 includes diagnoses from all hospital admissions in assigning PIPDCG.
         = 0 ignores diagnoses from all hospital admissions with LOS of less than 2 in assigning PIPDCG.
Default = 0

Important note: the prediction formulas in the software are calibrated excluding short stay admissions from PIPDCG assignment, regardless of whether the user-controllable switch is set to include or exclude short stays. That is, there is only one formula in the software, based on exclusion of short stays.

2) PIPADJ = 1 HER age/sex edits for invalid diagnoses are done.
          = 0 HER age/sex edits are not done.
Default = 0
See Appendix A for HER age/sex edits.

3) MAXDIAG = maximum number of diagnoses allowed on an admission record. Code as an integer, e.g., 1 2 ... 10 11 ...
Default = 10

4) YEAR2 = the year for which expenditures are being predicted. Format is yyyy.
Default = 2003
If YEAR2 extends across two calendar years, the calendar year of the first month of YEAR2 should be entered.

Important note: predicted expenditures are always in 1996 dollars. They are NOT adjusted for inflation, notwithstanding the value of YEAR2. YEAR2 only affects the computation of the age/sex cells, the ever disabled variable, and the age variable used in the HER age/sex edits.

5) YR2MONTH = the first month of YEAR2. Format is an integer 1-12.
Default = 1
Possible values are 1 = January, 2 = February, ..., 12 = December. E.g., if YEAR2 = 2003 and YR2MONTH = 1, then the software assumes that expenditures are being predicted for calendar year 2003. If YEAR2 = 2003 and YR2MONTH = 5, the software assumes that expenditures are being predicted for May 2003 - April 2004.

6) FAGESEX = 1 output file includes 34 age/sex cell variables for each person.
           = 0 age/sex variables not included in the output file.
Default = 1

7) WAM = [value]
Working aged multiplier for continuing beneficiaries. Multiplier used to adjust predicted expenditures and risk score of a person in working aged status. Any number can be entered.
Default = 0.21

8) WAM_NE = [value]
Working aged multiplier for new beneficiaries.
Default = 0.21

III.
Notes on Program Computations

Using the information on the two input files, the software assigns each continuing beneficiary a PIPDCG for Year 2, which ranges from 4 to 29. All new beneficiaries (NEWENROL = 1) are assigned a value of "missing" for the variable PIPDCG.

The software replaces principal chemotherapy diagnoses by the highest-ranked secondary cancer PIPDXG on the same admission record. If an admission has a principal diagnosis of chemotherapy, but no cancer diagnosis among the secondary diagnoses, the software assigns the admission to PIPDXG 14 - breast cancer - which is the lowest-paid cancer PIPDXG.

The software searches all secondary diagnoses for all admissions for HIV/AIDS diagnoses (PIPDXG 3). It assigns an admission to PIPDXG 3 if either a principal or secondary diagnosis of HIV/AIDS is present for that admission.

Excluding short stays takes precedence over the chemotherapy or HIV/AIDS algorithms. That is, short stays with a principal diagnosis of chemotherapy or HIV/AIDS are excluded.

For each continuing enrollee in the sample, the software computes a base relative risk score for year 2 using the payment formula that was announced by HCFA in its Report to Congress, March, 1999. The relative cost weights used in this formula are a function of the age/sex cells, EVERDISM, MCAID, and PIPDCG. This base relative risk score is called RSKSCORB. It is rounded to the nearest .001.

The Base Relative Risk Score, RSKSCORB, is adjusted for working aged status, and assigned to RSKSCORA, which denotes the Relative Risk Score Adjusted. The formula used is:

    RSKSCORA = RSKSCORB*(1 - MSP*(1-WAM))

where WAM is the user-defined working aged multiplier parameter, set to a default value of 0.21. Note that when MSP = 0, RSKSCORA = RSKSCORB, i.e., no change. Also, when MSP = 1, RSKSCORA = WAM*RSKSCORB. The relative risk score RSKSCORA is rounded to the nearest .001.

Annual predicted expenditures are calculated as:

    PREDEXPB = RSKSCORB*5100 and PREDEXPA = RSKSCORA*5100.
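The working aged adjustment and the annual amounts above can be illustrated with a short Python sketch. It is not taken from the SAS source. The function names, the use of decimal arithmetic, the half-up treatment of rounding ties, and the use of the already-rounded risk score in the multiplication are all assumptions made for illustration; the memorandum states only the formulas and the rounding precision.

```python
from decimal import Decimal, ROUND_HALF_UP

# Annual dollar multiplier stated in the memorandum (1996 dollars).
BASE_RATE = Decimal("5100")


def round_to(value: Decimal, step: str) -> Decimal:
    """Round to the given step, e.g. "0.001". Half-up ties are an assumption."""
    return value.quantize(Decimal(step), rounding=ROUND_HALF_UP)


def adjust_for_working_aged(rskscorb: str, msp: str, wam: str = "0.21") -> Decimal:
    """Apply RSKSCORA = RSKSCORB*(1 - MSP*(1-WAM)) and round to the nearest .001.

    Arguments are passed as strings so that they convert to Decimal exactly.
    """
    base, frac, mult = Decimal(rskscorb), Decimal(msp), Decimal(wam)
    return round_to(base * (1 - frac * (1 - mult)), "0.001")


def annual_predicted(risk_score: Decimal) -> Decimal:
    """Annual predicted expenditure: risk score times 5100, to the nearest .01."""
    return round_to(risk_score * BASE_RATE, "0.01")
```

For example, with RSKSCORB = 1.000, MSP = 0.5, and the default WAM of 0.21, the adjusted score is 1.000*(1 - 0.5*0.79) = 0.605, and the adjusted annual amount is 0.605*5100 = 3085.50. With MSP = 0 the score is unchanged, and with MSP = 1 it is multiplied by WAM.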
The annual amounts PREDEXPB and PREDEXPA are rounded to the nearest .01.

Monthly predicted expenditures are calculated from the annual amounts by:

    MPRDEXPB = PREDEXPB/12 and MPRDEXPA = PREDEXPA/12.

These amounts are rounded to the nearest .01.

For each new beneficiary, the same steps are followed except that the working aged multiplier for new beneficiaries, WAM_NE, replaces the working aged multiplier for continuing beneficiaries, WAM. Also, relative risk scores for new beneficiaries are derived from a different formula that is a function of age, sex, and MCAID, but not the PIPDCGs or EVERDISM.

Three additional values, CHFRSKSC (CHF-adjusted risk score), PREDEXPC, and MPRDEXPC, are calculated for each beneficiary. These values may include an additional payment component if that beneficiary has been identified as having received a diagnosis of Congestive Heart Failure (CHF) from 7/01/99 to 6/30/00 (indicated by the input variable CHFYEAR3 = 1) or from 7/01/00 to 6/30/01 (indicated by the input variable CHFYEAR4 = 1).

IV. Output File

The program outputs a person-level SAS dataset named OUTPUT with the following variables:

1) IDNO
The person's ID number. Same as input variable on PERSON and ADMISSN files.

2) SEX
Same as input variable.

3) DOB
Date of birth. Same as input variable.

4) OREC
Original reason for Medicare entitlement. Same as input variable.

5) EVERDISM
Fraction of eligible months in Year 2 that the person is in 'ever disabled' status.

6) MCAID
Medicaid status. Same as input variable.

7) MSP
Medicare as a secondary payer (working aged) status. Same as input variable.

8) PIPDCG
A person's Principal Inpatient Diagnostic Cost Group.

9) AGE
A person's age in years on the first day of YR2MONTH in YEAR2.

10) NEWENROL
Indicator variable for new Medicare beneficiary. Same as input variable.

11) PREDEXPB
Annualized base predicted expenditures, in 1996 dollars.

12) PREDEXPA
Annualized predicted expenditures, in 1996 dollars, adjusted for working aged status.
13) MPRDEXPB
Monthly base predicted expenditures, in 1996 dollars.

14) MPRDEXPA
Monthly predicted expenditures, in 1996 dollars, adjusted for working aged status.

15) RSKSCORB
Base relative risk score.

16) RSKSCORA
Relative risk score, adjusted for working aged status.

17) CHFYEAR3
Indicator variable for Congestive Heart Failure (CHF) diagnosis status between 7/01/99 and 6/30/00. Same as input variable.

18) CHFYEAR4
Indicator variable for Congestive Heart Failure (CHF) diagnosis status between 7/01/00 and 6/30/01. Same as input variable.

19) PREDEXPC
Annualized predicted expenditures, in 1996 dollars, adjusted for CHF status.

20) MPRDEXPC
Monthly predicted expenditures, in 1996 dollars, adjusted for CHF status.

21) CHFRSKSC
Relative risk score, adjusted for CHF status.

22) W0_34 W35_44 W45_54 W55_59 W60_64 W65_69 W70_74 W75_79 W80_84 W85_89 W90_94 W95_GT
    M0_34 M35_44 M45_54 M55_59 M60_64 M65_69 M70_74 M75_79 M80_84 M85_89 M90_94 M95_GT
    W65 W66 W67 W68 W69
    M65 M66 M67 M68 M69
34 age/sex variables indicating the fraction of eligible months in YEAR2 in each age/sex cell.

M0_34 indicates "male, age 0 to 34"
M65 indicates "male, age 65"
M95_GT indicates "male, 95 years or older"
W0_34 indicates "female, age 0 to 34"
etc.

The variables M65-M69 and W65-W69 are used for new beneficiaries instead of the variables M65_69 and W65_69. These variables are optionally output if the user-defined parameter FAGESEX is set to 1 (see Section II. above).

Appendix A: HER Age/Sex Diagnosis Edits

The following age/sex edits were used in model development:

1. This edit assumes that if neonatal codes occur on the record of a female 2 years or older, they are a baby's diagnoses on a mother's record. For males 2 years or older, they are assumed to be invalid.

If age >= 2 and PIPDXG is in interval from 166 to 170 then for Male (SEX=1) set PIPDXG=-1 (invalid); for Female (SEX=2) set PIPDXG=130.

2.
This edit specifies diagnostic categories that are inconsistent with sex (e.g., females with prostate diagnoses).

For Female: if SEX=2 and (PIPDXG is one of (18,121,122) or (PIPDXG=31 and ICD9 starts with 257)) then set PIPDXG=-1 (invalid).

For Male: if SEX=1 and (PIPDXG is one of (16,17,123,124,125) or (PIPDXG=31 and ICD9 starts with 256)) then set PIPDXG=-1 (invalid).

3. This edit specifies pregnancy/infertility diagnoses that are inconsistent with age and/or sex:

if (PIPDXG is in interval from 126 to 132 or (PIPDXG=124 and ICD9 starts with 628)) and (SEX='1' or AGE < 8 or AGE > 59) then set PIPDXG=-1 (invalid).
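The three edits in this appendix can be restated as a short Python sketch. It is an illustration only, not the SAS source. The function name apply_age_sex_edits, the argument names, and the choice to apply the edits one after another in the order listed are assumptions; the appendix does not say whether a PIPDXG reassigned by edit 1 is then checked by edits 2 and 3.

```python
INVALID = -1  # value the appendix assigns to an invalid PIPDXG


def apply_age_sex_edits(pipdxg: int, icd9: str, sex: str, age: int) -> int:
    """Return the PIPDXG after the three Appendix A edits.

    sex follows the PERSON file coding: "1" = male, "2" = female.
    icd9 is the diagnosis code as stored (no period, left-justified).
    Applying the edits sequentially is an assumption of this sketch.
    """
    # Edit 1: neonatal groups (166-170) on the record of someone aged 2 or older.
    if age >= 2 and 166 <= pipdxg <= 170:
        if sex == "1":
            return INVALID      # assumed invalid for males
        if sex == "2":
            pipdxg = 130        # treated as a baby's diagnosis on a mother's record

    # Edit 2: diagnostic groups inconsistent with sex.
    if sex == "2" and (pipdxg in (18, 121, 122)
                       or (pipdxg == 31 and icd9.startswith("257"))):
        return INVALID
    if sex == "1" and (pipdxg in (16, 17, 123, 124, 125)
                       or (pipdxg == 31 and icd9.startswith("256"))):
        return INVALID

    # Edit 3: pregnancy/infertility groups inconsistent with age and/or sex.
    if (126 <= pipdxg <= 132 or (pipdxg == 124 and icd9.startswith("628"))) \
            and (sex == "1" or age < 8 or age > 59):
        return INVALID

    return pipdxg
```

Under the sequential assumption, a neonatal group on the record of a 30-year-old woman becomes PIPDXG 130, while the same group on the record of a 70-year-old woman is first reassigned to 130 and then invalidated by edit 3 because her age exceeds 59.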