How does F-Score work for text categorization? When we create a model and upload the training and test data, the F-Score is calculated based on the test-data entries. What happens to the F-Score when new emails start coming in (in the context of the Email Channel) and they sometimes map correctly or incorrectly to a topic? Does the F-Score change automatically if an incoming email fails to map to a topic?
When a new email comes in and goes for manual triaging, any case created or reply sent by a CSR generates a feedback record, which lands in the Training Data tab of the email channel. Only after you verify these feedback records and rebuild the model does the F-Score change. The F-score is the harmonic mean of the precision and recall of a given model, and it should ideally increase as you provide a variety of labels (topics) and a variety of text. It may not improve if you keep providing the same text and labels.
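To make the "harmonic mean" part concrete, here is a minimal sketch (not the product's internal code) of how an F-score is derived from precision and recall:

```python
# F-score as the harmonic mean of precision and recall.
# The harmonic mean penalizes imbalance: a model with high precision
# but low recall (or vice versa) still gets a low F-score.
def f_score(precision: float, recall: float) -> float:
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)

# Example: precision 0.8, recall 0.6 -> F-score ~0.686,
# lower than the arithmetic mean (0.7).
print(round(f_score(0.8, 0.6), 3))
```

This is why feeding the model varied labels and text helps: it can lift recall (fewer missed topics) without sacrificing precision, and both must rise for the F-score to rise.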
Both the train and test sets should be good, accurate, non-junk data. You may choose not to provide anything there, in which case the model-build process will automatically take random samples. Typically a 70:30 train:test split is used for ML models.
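A hypothetical illustration of that 70:30 random split, assuming the build process simply shuffles and cuts the labeled records when you don't supply your own test set (the product's actual sampling logic may differ):

```python
import random

def split_70_30(records, seed=42):
    """Shuffle records and return (train, test) at a 70:30 ratio."""
    rng = random.Random(seed)  # fixed seed for a reproducible example
    shuffled = records[:]
    rng.shuffle(shuffled)
    cut = int(len(shuffled) * 0.7)
    return shuffled[:cut], shuffled[cut:]

emails = [f"email_{i}" for i in range(10)]  # placeholder records
train, test = split_70_30(emails)
print(len(train), len(test))  # 7 3
```

The point of holding out the 30% is that the model never sees those records during training, so scoring against them estimates performance on genuinely new emails.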
The model's F-score is generated from the test data only. The process builds the model with the 'train' data and then evaluates the generated model against the 'test' data. The truth table (TP, FP, etc.) produced this way is used to calculate the scores. Specify 'Test' data explicitly if you want to be sure the F-score is computed against your test data rather than a random sample.
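A sketch of how the scores fall out of that truth table, assuming per-topic counting of the model's predictions against the true labels on the test set (topic names and data here are invented for illustration):

```python
def score_from_predictions(actual, predicted, topic):
    """Compute precision, recall, and F-score for one topic
    from parallel lists of true and predicted labels."""
    tp = sum(1 for a, p in zip(actual, predicted) if p == topic and a == topic)
    fp = sum(1 for a, p in zip(actual, predicted) if p == topic and a != topic)
    fn = sum(1 for a, p in zip(actual, predicted) if p != topic and a == topic)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f1

# Hypothetical test-set labels for a "billing" topic:
actual    = ["billing", "billing", "refund", "billing", "refund"]
predicted = ["billing", "refund",  "refund", "billing", "billing"]
print(score_from_predictions(actual, predicted, "billing"))
```

This also shows why the F-score only changes on rebuild: the counts come from re-running the model over the test set, not from live emails trickling in.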