Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[GEN-178] Update to pandas 2.0 and upgraded synapseclient #559

Draft
wants to merge 6 commits into
base: develop
Choose a base branch
from

Conversation

rxu17
Copy link
Contributor

@rxu17 rxu17 commented Apr 16, 2024

Purpose: This will be the main branch to hold all of the individual update pandas PRs mainly because of the pandas PRs being interdependent on each other to work, and to have a branch where all of the changes can be tested and validated (once merged in) without affecting other genie development work on develop branch.

rxu17 and others added 6 commits April 16, 2024 15:57
…554)

* convert the clinical template columns from a set to a list

* add test function for Clinical.preprocess
* update method to concatenate columns

* refactor _update_table by separating it to several functions

* add test cases for new functions
…ssage (#558)

* update method to concatenate columns

* revert to origincal join function but add checks for empty df

* Update test_load.py

remove unused package

* refactor _update_table

* add test cases for seperated functions

* Update test_load.py

remove unused modules

* separate out oncotree code validation, sort oncotree codes

* add tests with duplicated unmapped codes

* make oncotree list more readable

---------

Co-authored-by: danlu1 <[email protected]>
Co-authored-by: Dan Lu <[email protected]>
Copy link

dpulls bot commented Apr 19, 2024

🎉 All dependencies have been resolved !

Copy link

sonarcloud bot commented Apr 19, 2024

Quality Gate Passed Quality Gate passed

Issues
32 New issues
0 Accepted issues

Measures
0 Security Hotspots
No data about Coverage
0.0% Duplication on New Code

See analysis details on SonarCloud

Copy link
Member

@thomasyu888 thomasyu888 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🔥 LGTM! Thanks for these changes, i'm going to pre-approve since I think all my comments are pretty minor.

col_order: List[str],
allupdates: pd.DataFrame,
to_delete_rows: pd.DataFrame,
):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: return type and inconsistent parameter naming (all_updates)

Comment on lines +228 to +231
dataset(pd.DataFrame): A dataframe
new_dataset: The re-ordered new dataset
primary_key_cols (list): Column(s) that make up the primary key
primary_key: The column name of the primary_key
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: Make sure arguments match

@@ -472,6 +472,68 @@ def process_steps(
newClinicalDf.to_csv(newPath, sep="\t", index=False)
return newPath

def _validate_oncotree_code_mapping(
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: these don't use self so you might be able to add these as @classmethod

]
return unmapped_oncotrees.index

def _validate_oncotree_code_mapping_message(
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same comment about classmethod

pyranges==0.0.115
# known working version 6.0
PyYAML>=5.1
synapseclient>=2.7.0,<3.0.0
synapseclient>=3.0.0,<4.0.0
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there a reason <4.0.0? Would it be worth it to bump it all the way up to the 4 series?

@@ -29,8 +29,8 @@ project_urls =
[options]
packages = find:
install_requires =
synapseclient>=2.7.0, <3.0.0
pandas>=1.0,<1.5.0
synapseclient>=3.0.0, <4.0.0
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same comment here about synapseclient, also either in this ticket or another ticket, bump the python versions we support.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants