Improving computer system performance has always been important, but continuing to do so under stringent power constraints (i.e., dark silicon) is increasingly challenging, yet critical for many emerging applications. Modern superscalar (out-of-order) processors deliver high performance but incur high design complexity, high power consumption, and large chip area. Low-power (in-order) cores, on the other hand, are inherently more power- and cost-efficient, but their major disadvantage is limited performance. The ideal processor microarchitecture would deliver high performance at low power and low cost, which might enable new applications in both the server and mobile/embedded spaces. This project explores and proposes enhancements at both the core level and the chip level to improve performance in a power- and cost-efficient way. The project makes contributions in four areas: (1) accelerating single-thread performance, (2) predicting performance and power, (3) scheduling for chip-level performance, and (4) workload-specific optimization.