-
Notifications
You must be signed in to change notification settings - Fork 9
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[GEN-178] Update to pandas 2.0 and upgraded synapseclient #559
base: develop
Are you sure you want to change the base?
Conversation
…554) * convert the clinical template columns from a set to a list * add test function for Clinical.preprocess
* update method to concatenate columns * refactor _update_table by separating it to several functions * add test cases for new functions
…ssage (#558) * update method to concatenate columns * revert to origincal join function but add checks for empty df * Update test_load.py remove unused package * refactor _update_table * add test cases for seperated functions * Update test_load.py remove unused modules * separate out oncotree code validation, sort oncotree codes * add tests with duplicated unmapped codes * make oncotree list more readable --------- Co-authored-by: danlu1 <[email protected]> Co-authored-by: Dan Lu <[email protected]>
🎉 All dependencies have been resolved ! |
Quality Gate passedIssues Measures |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🔥 LGTM! Thanks for these changes, i'm going to pre-approve since I think all my comments are pretty minor.
col_order: List[str], | ||
allupdates: pd.DataFrame, | ||
to_delete_rows: pd.DataFrame, | ||
): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit: return type and inconsistent parameter naming (all_updates)
dataset(pd.DataFrame): A dataframe | ||
new_dataset: The re-ordered new dataset | ||
primary_key_cols (list): Column(s) that make up the primary key | ||
primary_key: The column name of the primary_key |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit: Make sure arguments match
@@ -472,6 +472,68 @@ def process_steps( | |||
newClinicalDf.to_csv(newPath, sep="\t", index=False) | |||
return newPath | |||
|
|||
def _validate_oncotree_code_mapping( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit: these don't use self
so you might be able to add these as @classmethod
] | ||
return unmapped_oncotrees.index | ||
|
||
def _validate_oncotree_code_mapping_message( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same comment about classmethod
pyranges==0.0.115 | ||
# known working version 6.0 | ||
PyYAML>=5.1 | ||
synapseclient>=2.7.0,<3.0.0 | ||
synapseclient>=3.0.0,<4.0.0 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is there a reason <4.0.0? Would it be worth it to bump it all the way up to the 4 series?
@@ -29,8 +29,8 @@ project_urls = | |||
[options] | |||
packages = find: | |||
install_requires = | |||
synapseclient>=2.7.0, <3.0.0 | |||
pandas>=1.0,<1.5.0 | |||
synapseclient>=3.0.0, <4.0.0 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same comment here about synapseclient, also either in this ticket or another ticket, bump the python versions we support.
Purpose: This will be the main branch to hold all of the individual update pandas PRs mainly because of the pandas PRs being interdependent on each other to work, and to have a branch where all of the changes can be tested and validated (once merged in) without affecting other genie development work on
develop
branch.