Professor Stephen Gould
Areas of expertise
- Artificial Intelligence And Image Processing 0801
- Computer Vision 080104
- Adaptive Agents And Intelligent Robotics 080101
- Optimisation 010303
- Probability Theory 010404
Research interests
I have broad interests in computer and robotic vision, machine learning, probabilistic graphical models, and optimization. My main research focus is on the application of machine learning techniques (specifically, structured probabilistic models and, more recently, deep learning) to geometric and semantic scene understanding. I am also interested in seeing research get used; I collaborate with industry and have previously been involved in founding start-up companies.
Biography
Stephen Gould is a Professor of Computer Science at the Australian National University. He is a former ARC Postdoctoral Fellow and Microsoft Faculty
Fellow, and is a Contributed Researcher to the Data61 Machine Learning group and a Chief Investigator with the Australian
Research Council Centre of Excellence in Robotic Vision. Stephen received his BSc degree in mathematics and computer
science and BE degree in electrical engineering from the University of Sydney in 1994 and 1996, respectively. He received
his MS degree in electrical engineering from Stanford University in 1998. He then worked in industry for a number of years
where he co-founded Sensory Networks, which was sold to Intel in 2013. In 2005 he returned to PhD studies and earned his PhD
degree in Electrical Engineering from Stanford University in 2010. Stephen has broad interests in the areas of computer and
robotic vision, machine learning, probabilistic graphical models, and optimization. His main research focus is on automatic
semantic and geometric understanding of images and videos.
Publications
- Gould, S, Hartley, R & Campbell, D 2021, 'Deep Declarative Networks', IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 44, no. 8, pp. 3988-4004.
- Ben Shabat, Y & Gould, S 2020, 'DeepFit: 3D Surface Fitting via Neural Network Weighted Least Squares', 16th European Conference on Computer Vision, ECCV 2020, ed. A. Vedaldi, H. Bischof, T. Brox & J-M. Frahm, Springer, Cham, Switzerland, pp. 20-34.
- Ali Akbarian, M, Saleh, F, Salzmann, M et al. 2020, 'A stochastic conditioning scheme for diverse human motion prediction', 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition, CVPR 2020, IEEE, United States, pp. 5222-5231.
- Cherian, A & Gould, S 2019, 'Second-order Temporal Pooling for Action Recognition', International Journal of Computer Vision, vol. 127, no. 4, pp. 340-362.
- Fonseca De Santa Cruz Oliveira, R, Fernando, B, Cherian, A et al. 2019, 'Visual Permutation Learning', IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 41, no. 12, pp. 3100-3114.
- Anderson, P, He, X, Buehler, C et al. 2018, 'Bottom-Up and Top-Down Attention for Image Captioning and Visual Question Answering', 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition, IEEE, United States, pp. 6077-6086.
- Saleh, F, Ali Akbarian, M, Salzmann, M et al. 2018, 'Incorporating Network Built-in Priors in Weakly-Supervised Semantic Segmentation', IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 40, no. 6, pp. 1382-1396.
- Anderson, P, Fernando, B, Johnson, M et al. 2017, 'Guided open vocabulary image captioning with constrained beam search', Conference on Empirical Methods in Natural Language Processing, EMNLP2017, ed. Martha Palmer, Rebecca Hwa, Sebastian Riedel, Association for Computational Linguistics, United States, pp. 936-945.
- Cherian, A, Koniusz, P & Gould, S 2017, 'Higher-order pooling of cnn features via kernel linearization for action recognition', 17th IEEE Winter Conference on Applications of Computer Vision, WACV 2017, Institute of Electrical and Electronics Engineers (IEEE Inc), Piscataway, New Jersey, US, pp. 130-138.
- Fernando, B & Gould, S 2017, 'Discriminatively Learned Hierarchical Rank Pooling Networks', International Journal of Computer Vision, vol. 124, no. 3, pp. 335-355.
- Cherian, A, Fernando, B, Harandi, M et al. 2017, 'Generalized rank pooling for activity recognition', CVPRW 2017 - 30th IEEE Computer Society Conference on Computer Vision and Pattern Recognition Workshops, ed. Lisa O'Conner, IEEE, United States, pp. 1581-1590.
- Fernando, B, Bilen, H, Gavves, E et al. 2017, 'Self-supervised video representation learning with odd-one-out networks', CVPRW 2017 - 30th IEEE Computer Society Conference on Computer Vision and Pattern Recognition Workshops, ed. Lisa O'Conner, IEEE, United States, pp. 5729-5738.
- Santa Cruz, R, Fernando, B, Cherian, A & Gould, S 2017, 'DeepPermNet: Visual permutation learning', 30th IEEE Conference on Computer Vision and Pattern Recognition, CVPR 2017, ed. Lisa O’Conner, IEEE, Peer review, pp. 6044-6052.
- Fernando, B, Anderson, P, Hutter, M et al. 2016, 'Discriminative hierarchical rank pooling for activity recognition', IEEE Conference on Computer Vision and Pattern Recognition, 2016, IEEE, USA, pp. 1924-1932.
- Bilen, H, Fernando, B, Gavves, E et al. 2016, 'Dynamic image networks for action recognition', IEEE Conference on Computer Vision and Pattern Recognition, 2016, IEEE, USA, pp. 3034-3042.
- Khan, A, Gould, S & Salzmann, M 2016, 'Deep convolutional neural networks for human embryonic cell counting', European Conference on Computer Vision, ECCV 2016, ed. Hua. G, Jegou. H, Springer International Publishing AG, Switzerland, pp. 339-348.
- Fernando, B & Gould, S 2016, 'Learning End-to-end Video Classification with Rank-Pooling', 33rd International conference on Machine Learning 2016, ed. Maria Florina Balcan, Kilian Q. Weinberger, JMLR - Journal of Machine Learning, Online, 10 pp.
- Khan, A, Gould, S & Salzmann, M 2016, 'Segmentation of developing human embryo in time-lapse microscopy', International Symposium on Biomedical Imaging: From Nano to Macro, ISBI 2016, IEEE, TBC, pp. 930-934.
- Anderson, P, Fernando, B, Johnson, M et al. 2016, 'SPICE: Semantic propositional image caption evaluation', 14th European Conference on Computer Vision, ECCV 2016, Springer Verlag, Berlin, pp. 382-398.
- Sadeghi Sokeh, H, Gould, S & Renz, J 2015, 'Determining Interacting Objects in Human-Centric Activities via Qualitative Spatio-Temporal Reasoning', 12th Asian Conference on Computer Vision, ACCV 2014, ed. H.Reid, I.Yang, Springer, TBC, pp. 550-563.
- Gould, S 2015, 'Learning Weighted Lower Linear Envelope Potentials in Binary Markov Random Fields', IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 37, no. 7, pp. 1336-1346.
- Rezatofighi, S, Gould, S, Vo, B et al. 2015, 'Multi-target tracking with time-varying clutter rate and detection profile: Application to time-lapse cell microscopy sequences', IEEE Transactions on Medical Imaging, vol. 34, no. 6, pp. 1336-1348.
- Khan, A, Gould, S & Salzmann, M 2015, 'A linear chain markov model for detection and localization of cells in early stage embryo development', IEEE Winter Conference on Applications of Computer Vision, WACV 2015, IEEE, TBC, pp. 526-533.
- Khan, A, Gould, S & Salzmann, M 2015, 'Automated monitoring of human embryonic cells up to the 5-cell stage in time-lapse microscopy images', IEEE International Symposium on Biomedical Imaging, ISBI 2015, IEEE Computer Society, TBC, pp. 389-393.
- Liu, B, He, X & Gould, S 2015, 'Multi-class Semantic Video Segmentation with Exemplar-Based Object Reasoning', IEEE Winter Conference on Applications of Computer Vision, WACV 2015, IEEE, TBC, pp. 1014-1021.
- Pham, T, Reid, I, Latif, Y et al. 2015, 'Hierarchical Higher-Order Regression Forest Fields: An Application to 3D Indoor Scene Labelling', IEEE International Conference on Computer Vision (ICCV/ICCVW 2015), IEEE Computer Society, USA, pp. 2246-2254.
- Gould, S & He, X 2014, 'Scene understanding by labeling pixels', Communications of the Association for Computing Machinery, vol. 57, no. 11, pp. 68-77.
- Moussavi, F, Wang, Y, Lorenzen, P et al. 2014, 'A Unified Graphical Models Framework for Automated Mitosis Detection in Human Embryos', IEEE Transactions on Medical Imaging, vol. 33, no. 7, pp. 1551-1562.
- Gould, S, Zhao, J, He, X et al. 2014, 'Superpixel Graph Label Transfer with Learned Distance Metric', Lecture Notes in Computer Science (LNCS), vol. 8689, no. 2014, pp. 632-647.
- He, X & Gould, S 2014, 'An Exemplar-Based CRF for Multi-instance Object Segmentation', 27th IEEE Conference on Computer Vision and Pattern Recognition, CVPR 2014, IEEE, Columbus USA, pp. 296-303.
- Liu, B, He, X & Gould, S 2014, 'Joint semantic and geometric segmentation of videos with a stage model', 2014 IEEE Winter Conference on Applications of Computer Vision, WACV 2014, IEEE Computer Society, USA, pp. 737-744.
- Wang, H, Gould, S & Koller, D 2013, 'Discriminative learning with latent variables for cluttered indoor scene understanding', Communications of the Association for Computing Machinery, vol. 56, no. 4, pp. 92-99.
- Rezatofighi, S, Gould, S, Vo, B et al. 2013, 'A Multiple Model Probability Hypothesis Density Tracker for Time-Lapse Cell Microscopy Sequences', Lecture Notes in Computer Science (LNCS), vol. 7917, no. 2013, pp. 110-122.
- Sadeghi Sokeh, H, Gould, S & Renz, J 2013, 'Efficient Extraction and Representation of Spatial Information from Video Data', 23rd International Joint Conference on Artificial Intelligence, IJCAI 2013, AAAI Press, USA, pp. 1076-1082.
- Rezatofighi, S, Pitkeathly, W, Gould, S et al. 2013, 'A Framework for Generating Realistic Synthetic Sequences of Total Internal Reflection Fluorescence Microscopy Images', International Symposium on biomedical Imaging ISBI 2013, IEEE, USA, pp. 157-160.
- He, X & Gould, S 2013, 'Multi-instance object segmentation with exemplars', 14th IEEE International Conference on Computer Vision Workshops, ICCVW 2013, IEEE, Sydney, NSW, pp. 1-4.
- Sadeghi Sokeh, H & Gould, S 2012, 'Towards unsupervised semantic segmentation of street scenes from motion cues', Image and Vision Computing New Zealand Conference (IVCNZ 2012), Association for Computing Machinery Inc (ACM), Dunedin, pp. 232-237.
- Gould, S 2012, 'DARWIN: A framework for machine learning and computer vision research and development', Journal of Machine Learning Research, vol. 13, pp. 3533-3537.
- Park, K & Gould, S 2012, 'On learning higher-order consistency potentials for multi-class pixel labeling', 12th European Conference on Computer Vision (ECCV 2012), ed. A Fusiello, V Murino & R Cucchiara, Springer, Berlin, pp. 202-215.
- Gould, S 2012, 'Multiclass pixel labeling with non-local matching constraints', IEEE Conference on Computer Vision and Pattern Recognition CVPR 2012, Institute of Electrical and Electronics Engineers (IEEE Inc), Piscataway, NJ USA, pp. 2783-2790.
- Gould, S & Zhang, Y 2012, 'PatchMatchGraph: Building a graph of dense patch correspondences for label transfer', 12th European Conference on Computer Vision (ECCV 2012), ed. A Fusiello, V Murino & R Cucchiara, Springer, Berlin, pp. 439-452.
- Rezatofighi, S, Gould, S, Hartley, R et al. 2012, 'Application of the IMM-JPDA filter to multiple target tracking in total internal reflection fluorescence microscopy images', MICCAI 2012 - 15th International Conference on Medical Image Computing and Computer Assisted Intervention, Springer, Berlin, pp. 357-364.
- Yang, D, Gould, S & Hutter, M 2012, 'A Noise Tolerant Watershed Transformation with Viscous Force for Seeded Image Segmentation', Lecture Notes in Computer Science (LNCS), vol. 7724, pp. 775-789.
- Gould, S 2011, 'Max-margin Learning for Lower Linear Envelope Potentials in Binary Markov Random Fields', International Conference on Machine Learning (ICML 2011), Conference Organising Committee, TBC, p. 8.
- Rivera, P & Gould, S 2011, 'Simultaneous Multi-class Pixel Labeling over Coherent Image Sets', 2011 International Conference on Digital Image Computing: Techniques and Applications (DICTA 2011), ed. A P Bradley, Y Gal, P Jackway and O Salvado, IEEE, United States, p. 8.
- Jojic, V, Gould, S & Koller, D 2010, 'Accelerated Dual Decomposition for MAP Inference', International Conference on Machine Learning (ICML 2010), ed. Johannes Fürnkranz, OmniPress, Wisconsin, pp. 503-510.
- Packer, B, Gould, S & Koller, D 2010, 'A Unified Contour-Pixel Model for figure-ground Segmentation', European Conference on Computer Vision (ECCV 2010), ed. K. Daniilidis, P. Maragos, N. Paragios, Springer, Berlin, Heidelberg, pp. 338-351.
- Wang, H, Gould, S & Koller, D 2010, 'Discriminative Learning with Latent Variables for Cluttered Indoor Scene Understanding', European Conference on Computer Vision (ECCV 2010), ed. K. Daniilidis, P. Maragos, N. Paragios, Springer, Berlin, Heidelberg, pp. 435-449.
- Liu, B, Gould, S & Koller, D 2010, 'Single Image Depth Estimation from Predicted Semantic Labels', Computer Vision and Pattern Recognition Conference (CVPR 2010), ed. Conference Program Committee, Institute of Electrical and Electronics Engineers (IEEE Inc), San Francisco, USA, pp. 1253-1260.
- Gould, S, Gao, T & Koller, D 2009, 'Region-based Segmentation and Object Detection', Conference on Advances in Neural Information Processing Systems (NIPS 2009), ed. Y. Bengio, D. Schuurmans, J. Lafferty, C. Williams, A. Culotta, MIT Press, Vancouver, Canada.
- Gould, S, Fulton, R & Koller, D 2009, 'Decomposing a Scene into Geometric and Semantically Consistent Regions', IEEE International Conference on Computer Vision (ICCV 2009), ed. Conference Program Committee, Institute of Electrical and Electronics Engineers (IEEE Inc), Piscataway USA, pp. 1-8.
- Gould, S, Amat, F & Koller, D 2009, 'Alphabet SOUP: A Framework for Approximate Energy Minimization', Computer Vision and Pattern Recognition Conference (CVPR 2009), ed. Pat Flynn, Eric Mortensen, Institute of Electrical and Electronics Engineers (IEEE Inc), Miami, USA, pp. 903-910.
- Quigley, M, Batra, S, Gould, S et al. 2009, 'High-Accuracy 3D Sensing for Mobile Manipulation: Improving Object Detection and Door Opening', IEEE International Conference on Robotics and Automation ICRA 2009, ed. Conference Program Committee, Institute of Electrical and Electronics Engineers (IEEE Inc), Kobe, Japan, pp. 2816-2822.
- Duchi, J, Gould, S & Koller, D 2008, 'Projected Subgradient Methods for Learning Sparse Gaussians', Conference on Uncertainty in Artificial Intelligence (UAI 2008), ed. David McAllester, Petri Myllymaki, AUAI Press, Corvallis, Oregon, pp. 145-152.
- Gould, S, Rodgers, J, Cohen, D et al. 2008, 'Multi-class Segmentation with Relative Location Prior', International Journal of Computer Vision, vol. 80, pp. 300-316.
- Gould, S, Baumstarck, P, Quigley, M et al. 2008, 'Integrating Visual and Range Data for Robotic Object Detection', 10th European Conference on Computer Vision (ECCV 2008), ed. Forsyth, David; Torr, Philip; Zisserman, Andrew (Eds.), Springer, France, pp. 1-12.
- Elidan, G & Gould, S 2008, 'Learning Bounded Treewidth Bayesian Networks', Journal of Machine Learning Research, vol. 9, pp. 2699-2731.
- Heitz, G, Gould, S, Saxena, A et al. 2008, 'Cascaded Classification Models: Combining Models for Holistic Scene Understanding', Conference on Advances in Neural Information Processing Systems (NIPS 2008), ed. Daphne Koller, Yoshua Bengio, Dale Schuurmans, Leon Bottou, and Aron Culotta, MIT Press, Vancouver, Canada, pp. 641-648.
- Gould, S, Arfvidsson, J, Kaehler, A et al. 2007, 'Peripheral-foveal vision for real-time object recognition and tracking in video', International Joint Conference on Artificial Intelligence (IJCAI 2007), ed. M.M. Veloso, Morgan Kaufmann Publishers, Inc., USA, pp. 2118-2121.
Projects and Grants
Grants information is drawn from ARIES.
- Towards in-vehicle situation awareness using visual and audio sensors (Primary Investigator)
- Declarative Networks: Towards Robust and Explainable Deep Learning (Primary Investigator)
- Centre of Excellence in Robotic Vision (CRV) (Secondary Investigator)
- Imaging-Based discovery and tracking of cellular structures during early embryo development (Primary Investigator)
- Learning Clique Potentials for High-Order Graphical Models (Primary Investigator)