This repository has been archived by the owner on Jul 31, 2024. It is now read-only.

Absent new_lines and indentation in python data #5

Closed
nadiinchi opened this issue Dec 15, 2021 · 2 comments
Labels
bug Something isn't working

Comments

@nadiinchi

Hi!

I downloaded the data both from AVATAR/data/data.zip and with the script AVATAR/data/download.sh, and it seems that many Python functions in the dataset are missing newlines and indentation. For example, CodeForces/421/A/solution1.py:

n, a, b = map(int, input().split())athur = map(int, input().split())alex = map(int, input().split()) total = [1] * n for i in alex:    total[i-1] = 2 print(*total)

or CodeForces/981/A/solution1.py:

s=input()c=len(s)for i in range(len(s)-1,0,-1):    k=s[0:i+1]    if(k!=k[::-1]):        print(c)        exit()    c-=1if(c==1):    print("0")

By a simple heuristic estimate, about 50% of the Python functions look like this.

Is there a way to fix it? Thanks in advance for your help!
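
The heuristic behind the 50% estimate is not spelled out in this thread. One possible check, as a minimal sketch: treat a file as affected if it no longer parses as Python 3. The function names `looks_collapsed` and `collapsed_fraction` are hypothetical and not part of the AVATAR repository.

```python
import ast


def looks_collapsed(source: str) -> bool:
    """Return True if `source` does not parse as Python 3.

    Assumption: a file whose newlines were stripped will almost always
    raise SyntaxError, as both examples quoted in this issue do.
    """
    try:
        ast.parse(source)
    except (SyntaxError, ValueError):
        # IndentationError is a subclass of SyntaxError, so it is covered too.
        return True
    return False


def collapsed_fraction(sources) -> float:
    """Fraction of the given source strings that fail to parse."""
    sources = list(sources)
    if not sources:
        return 0.0
    return sum(looks_collapsed(s) for s in sources) / len(sources)


if __name__ == "__main__":
    # Flattened text quoted from CodeForces/981/A/solution1.py in this issue.
    broken = (
        's=input()c=len(s)for i in range(len(s)-1,0,-1):    k=s[0:i+1]'
        '    if(k!=k[::-1]):        print(c)        exit()    c-=1'
        'if(c==1):    print("0")'
    )
    # A well-formed snippet for comparison (illustrative, not from the dataset).
    intact = "s = input()\nprint(len(s))\n"
    print(looks_collapsed(broken), looks_collapsed(intact))
    print(collapsed_fraction([broken, intact]))
```

This check also flags files that fail to parse for other reasons, such as Python 2 solutions using `print` statements, so it would tend to overestimate the share of flattened files.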

@zfj1998

zfj1998 commented May 5, 2022

Same question. Line breaks and indentation are essential for rebuilding the syntax tree.

@wasiahmad
Owner

We have resolved the issue by re-crawling the dataset. We released the new dataset along with other updates.

@wasiahmad added the bug label on Dec 3, 2022