Commit 74eab182 authored by Konstantin Julius Lotzgeselle

Minor changes

parent f6eb2361
@@ -64,8 +64,7 @@ def get_prepared_data(source_data_path: str,
def count_words(string: str) -> int:
    return len(string.split())
get_prepared_data("data/tokenizer-data/news-commentary-v11.de", "data/tokenizer-data/news-commentary-v11.en", seperator_symbol=None)
# get_prepared_data("data/tokenizer-data/news-commentary-v11.de", "data/tokenizer-data/news-commentary-v11.en")
def create_tokenizers(source_data_path: str, target_data_path: str, source_language: str, target_language: str):
    # setting the unknown token (e.g. for emojis)
......
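
For reviewers, a minimal sketch of how the `count_words` helper shown in this hunk behaves on a few inputs. The function body is copied from the diff; the sample strings are made up for illustration and are not taken from the news-commentary data.

```python
def count_words(string: str) -> int:
    # str.split() with no arguments splits on any run of whitespace
    # and drops empty strings, so leading/trailing spaces are ignored.
    return len(string.split())


# Hypothetical sample inputs (not from the repository's data files).
samples = ["Guten Morgen", "  two   words  ", "tab\tseparated\nlines", ""]
counts = [count_words(s) for s in samples]
print(counts)  # prints [2, 2, 3, 0]
```

Because the split is purely whitespace-based, punctuation attached to a word (e.g. "Morgen!") is counted as part of that word, which may or may not match what the tokenizers built in `create_tokenizers` produce.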