Autograder [Thu Apr 30 17:37:24 2020]: Received job 11785-s20_hw4p1_16_carunach@andrew.cmu.edu:1241
Autograder [Thu Apr 30 17:38:00 2020]: Success: Autodriver returned normally
Autograder [Thu Apr 30 17:38:00 2020]: Here is the output from the autograder:
---
Autodriver: Job exited with status 0
mkdir -p handin
tar xf handin.tar -C handin
tar: training.ipynb: time stamp 2020-04-30 17:36:42 is 14348.877689109 s in the future
tar: predictions.npy: time stamp 2020-04-30 17:36:42 is 14348.863304234 s in the future
tar: generated.txt: time stamp 2020-04-30 17:36:42 is 14348.863221454 s in the future
tar: generated_logits.npy: time stamp 2020-04-30 17:36:42 is 14348.863169546 s in the future
tar xf autograde.tar
AUTOLAB=1 /usr/local/depot/anaconda3/bin/python3 autograde/runner.py --module-path=./handin/
Your mean NLL for generated sequences: 2.6584229469299316
.Your mean NLL for predicting a single word: 5.555450916290283
F
=================================== FAILURES ===================================
_______________________________ test_prediction ________________________________

    def test_prediction():
        fixture = np.load(fixture_path('prediction.npz'))
        inp = fixture['inp']
        targ = fixture['out']
        out = np.load(handin_path('predictions.npy'))
        assert out.shape[0] == targ.shape[0]
        vocab = np.load(fixture_path('vocab.npy'))
        assert out.shape[1] == vocab.shape[0]
        out = log_softmax(out, 1)
        nlls = out[np.arange(out.shape[0]), targ]
        nll = -np.mean(nlls)
        print("Your mean NLL for predicting a single word: {}".format(nll))
>       assert nll < 5.4
E       assert 5.5554509 < 5.4

autograde/tests/test_prediction.py:31: AssertionError
Run time:  19.52686333656311
{"scores": {"Generation": 50.0, "Prediction": 0.0}}
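
The failing check in test_prediction can be reproduced locally before resubmitting. Below is a minimal sketch of the same computation, under stated assumptions: the autograder's own log_softmax helper is not shown in the log, so the version here is a standard numerically stable stand-in, and the names mean_nll, vocab_size, logits, and targets are hypothetical. The data is synthetic; to check a real submission, load predictions.npy and the matching targets instead.

```python
import numpy as np


def log_softmax(x, axis):
    # Assumed stand-in for the autograder's helper (not shown in the log).
    # Subtracting the max first keeps np.exp from overflowing.
    shifted = x - np.max(x, axis=axis, keepdims=True)
    return shifted - np.log(np.sum(np.exp(shifted), axis=axis, keepdims=True))


def mean_nll(logits, targets):
    # Mirrors the test body: normalize over the vocabulary axis, pick the
    # log-probability of each target word, then negate the mean.
    log_probs = log_softmax(logits, 1)
    picked = log_probs[np.arange(logits.shape[0]), targets]
    return -np.mean(picked)


# Synthetic data only; the sizes are arbitrary and not taken from the fixture.
rng = np.random.default_rng(0)
vocab_size = 1000
targets = rng.integers(0, vocab_size, size=8)

# All-equal logits give a uniform distribution, so the NLL is log(vocab_size).
uniform_logits = np.zeros((8, vocab_size))
uniform_nll = mean_nll(uniform_logits, targets)

# Putting a large logit on each correct word drives the NLL toward zero.
confident_logits = np.zeros((8, vocab_size))
confident_logits[np.arange(8), targets] = 20.0
confident_nll = mean_nll(confident_logits, targets)

print("uniform NLL:", uniform_nll)
print("confident NLL:", confident_nll)
print("passes 5.4 cutoff:", confident_nll < 5.4)
```

The cutoff of 5.4 comes from the assertion in the traceback. The submission scored about 5.56, so the prediction model needs a lower per-word NLL to pass.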