Hai Pham

Ph.D. Student, Language Technologies Institute
School of Computer Science, Carnegie Mellon University


I have graduated and joined the wonderful and super talented team at Reka AI. We are hiring, and we are looking for the best to join us!

I was fortunate to be advised by Prof. Barnabás Póczos and Prof. David Woodruff. I have broad interests in Machine Learning and Deep Learning, with theory and applications in Natural Language Processing and Computer Vision. I am also interested in Optimization, Large-Scale Machine Learning Systems, and Numerical Methods.

Prior to starting my Ph.D., I received my Master's in Language Technologies from the same department. Before that, I graduated in Computer Science from Auburn University with a focus on Big Data and Distributed Systems.

I worked as a Research Intern at Boeing in 2017 and at Microsoft in Summer 2021 and 2022, and as an Applied Scientist Intern at Amazon AWS in Fall 2022.

Email: hai [at] reka [dot] ai

Preprints

Publications


				@article{hpham23thesis,
					title={Towards Efficient and Scalable Representation Learning},
					author={Pham, Hai},
					journal={Ph.D. Thesis, School of Computer Science, Carnegie Mellon University},
					year={2023},
					url={Thesis_final.pdf},
				}

				@inproceedings{woodruff2021optimal,
					title={Optimal Sketching for Trace Estimation},
					author={Jiang, Shuli and Pham, Hai and Woodruff, David P and Zhang, Richard},
					booktitle={Thirty-fifth Conference on Neural Information Processing Systems (NeurIPS) (SPOTLIGHT)},
					year={2021},
					url={https://arxiv.org/pdf/2111.00664.pdf},
					code={https://github.com/11hifish/OptSketchTraceEst}
				}

				@inproceedings{lyu2021styleptb,
					title={StylePTB: A Compositional Benchmark for Fine-grained Controllable Text Style Transfer},
					author={Lyu*, Yiwei and Liang*, Paul Pu and Pham*, Hai and Hovy, Eduard and Póczos, Barnabás and Salakhutdinov, Ruslan and Morency, Louis-Philippe},
					booktitle={Annual Conference of the North American Chapter of the Association for Computational Linguistics (NAACL)},
					year={2021},
					url={https://arxiv.org/pdf/2104.05196.pdf},
					code={https://github.com/lvyiwei1/StylePTB/}
				}


				@inproceedings{pham2020robust,
				  title={Robust Handwriting Recognition with Limited and Noisy Data},
				  author={Pham, Hai and Setlur, Amrith and Dingliwal, Saket and Lin, Tzu-Hsiang and Póczos, Barnabás and Huang, Kang and Li, Zhuo and Lim, Jae and McCormack, Collin and Vu, Tam},
				  booktitle={2020 17th International Conference on Frontiers in Handwriting Recognition (ICFHR)},
				  pages={301--306},
				  year={2020},
				  organization={IEEE},
				  url={https://arxiv.org/pdf/2008.08148.pdf}
				}

				@inproceedings{hoang2020revisiting,
					title={Revisiting the Sample Complexity of Sparse Spectrum Approximation of Gaussian Processes},
					author={Hoang, Quang Minh and Hoang, Trong Nghia and Pham, Hai and Woodruff, David P},
					booktitle={Thirty-fourth Conference on Neural Information Processing Systems (NeurIPS)},
					year={2020},
					url={https://arxiv.org/pdf/2011.08432.pdf},
					code={https://github.com/hqminh/gp_sketch_nips}
				}

				@inproceedings{pham2019found,
					title={Found in translation: Learning robust joint representations by cyclic translations between modalities},
					author={Pham*, Hai and Liang*, Paul Pu and Manzini, Thomas and Morency, Louis-Philippe and Póczos, Barnabás},
					booktitle={Proceedings of the AAAI Conference on Artificial Intelligence},
					volume={33},
					number={01},
					pages={6892--6899},
					year={2019},
					url={https://arxiv.org/pdf/1812.07809.pdf},
					code={https://github.com/hainow/MCTN}
				}

				@inproceedings{pham2018seq2seq2sentiment,
					title={Seq2seq2sentiment: Multimodal sequence to sequence models for sentiment analysis},
					author={Pham, Hai and Manzini, Thomas and Liang, Paul Pu and Póczos, Barnabás},
					booktitle={Proceedings of Grand Challenge and Workshop on Human Multimodal Language (Challenge-HML)},
					year={2018},
					url={https://arxiv.org/pdf/1807.03915.pdf}
				}

				@inproceedings{zhou2015sfmapreduce,
					title={Sfmapreduce: An optimized mapreduce framework for small files},
					author={Zhou, Fang and Pham, Hai and Yue, Jianhui and Zou, Hao and Yu, Weikuan},
					booktitle={2015 IEEE International Conference on Networking, Architecture and Storage (NAS)},
					pages={23--32},
					year={2015},
					organization={IEEE},
					url={https://www.cs.fsu.edu/~yuw/pubs/2015-NAS-Yu.pdf}
				}

				@article{pham2016assessment,
					title={Assessment of Multiple Ingest Strategies for Accumulo Key-Value Store},
					author={Pham, Hai},
					year={2016},
					journal={Master's Thesis, Computer Science, Auburn University},
					url={https://etd.auburn.edu/bitstream/handle/10415/5135/hpham%20-%20Grad%20Thesis%20-%20final.pdf?sequence=2&isAllowed=y}
				}

			

Teaching

Lead Instructor: 11-695 AI Engineering, Spring 2020
Lead Instructor: 11-695 AI Engineering, Spring 2019
Co-Instructor: 11-695 Competitive Engineering, Spring 2018