The Cekirge Method introduces a deterministic, algebraic paradigm for artificial intelligence that replaces stochastic gradient descent and related iterative schemes, such as plain and conjugate gradient descent, with a single closed-form computation. Rather than updating parameters through iterative optimization, the method computes the optimal mapping between contextual inputs and target outputs analytically. This closed-form formulation eliminates randomness, guarantees reproducibility across hardware platforms, and avoids the variability inherent in gradient-based training. σ-Regularization ensures that all matrices involved in the computation remain invertible and well-conditioned, allowing the system to operate reliably even when contextual structures exhibit high correlation or near-singularity. Benchmark comparisons with GPT-type transformer architectures show that the deterministic mapping achieves comparable accuracy while requiring far fewer computational steps. The absence of iterative training removes common issues associated with stochastic optimization, including sensitivity to initialization, unpredictable convergence paths, and gradient noise. Perturbation analysis further demonstrates stable behavior: small, uniformly applied modifications to the attention matrices produce smooth, monotonic variations in loss, with an effective stability coefficient of k ≈ 1.8. This indicates that the solution behaves predictably and remains well-conditioned under structured variations in input. The algebraic nature of the method also confers strong interpretability: every transformation, from the contextual matrices Q, K, and V to the final mapping W*, is explicit and invertible, enabling complete traceability of how each component of the input contributes to the output. The result is a transparent computational pipeline, in contrast to the opaque weight distributions that emerge from stochastic gradient descent. The formulation extends naturally to multi-head attention mechanisms and large-matrix architectures, offering a pathway to scalable deterministic transformers. By replacing probabilistic search with analytic resolution, the Cekirge Method establishes a mathematically grounded alternative to conventional learning. The framework provides deterministic convergence, structural clarity, and reproducible outcomes, laying the foundation for a new class of explainable and reliable artificial intelligence systems.
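As a rough illustration of the closed-form idea summarized above, the sketch below solves a σ-regularized least-squares mapping in one algebraic step and runs a small perturbation sweep. It is a minimal sketch under assumed definitions: the ridge-style formula W* = (KᵀK + σI)⁻¹KᵀV, the σ value, and the synthetic data are illustrative assumptions, not the paper's exact formulation.

```python
# Minimal sketch of a σ-regularized closed-form mapping (illustrative assumptions
# throughout: the ridge-style formula, the sigma value, and the synthetic data
# are not taken from the paper).
import numpy as np

rng = np.random.default_rng(0)  # fixed seed; the solve itself is deterministic

# Hypothetical contextual inputs K (n x d) and targets V (n x m)
n, d, m = 64, 16, 8
K = rng.normal(size=(n, d))
W_true = rng.normal(size=(d, m))
V = K @ W_true + 0.1 * rng.normal(size=(n, m))

def closed_form_mapping(K, V, sigma=1e-2):
    """Compute W* = (K^T K + sigma*I)^-1 K^T V in a single algebraic step.

    The sigma*I term keeps K^T K invertible and well-conditioned even when
    the columns of K are highly correlated or nearly singular.
    """
    return np.linalg.solve(K.T @ K + sigma * np.eye(K.shape[1]), K.T @ V)

def loss(K, V, W):
    """Mean squared error of the mapping K @ W against the targets V."""
    return float(np.mean((K @ W - V) ** 2))

W_star = closed_form_mapping(K, V)
print(f"baseline loss: {loss(K, V, W_star):.4f}")

# Crude stand-in for the perturbation analysis: scale the context uniformly by
# (1 + eps) and confirm the loss varies smoothly rather than erratically.
for eps in (0.01, 0.02, 0.05):
    print(f"eps={eps:.2f}  loss={loss((1.0 + eps) * K, V, W_star):.4f}")
```

In this toy setting the single solve plays the role of the training loop, which is what removes initialization sensitivity and gradient noise from the picture.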
| Published in | American Journal of Artificial Intelligence (Volume 9, Issue 2) |
| DOI | 10.11648/j.ajai.20250902.26 |
| Page(s) | 272-280 |
| Creative Commons | This is an Open Access article, distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution and reproduction in any medium or format, provided the original work is properly cited. |
| Copyright | Copyright © The Author(s), 2025. Published by Science Publishing Group |
| Keywords | Deterministic Learning, σ-Regularization, AI Energy-efficient Computation, Cekirge Method, GPT Benchmarking |
APA Style
Cekirge, H. M. (2025). Deterministic σ-Regularized Benchmarking of the Cekirge Model Against GPT-Transformer Baselines. American Journal of Artificial Intelligence, 9(2), 272-280. https://doi.org/10.11648/j.ajai.20250902.26
ACS Style
Cekirge, H. M. Deterministic σ-Regularized Benchmarking of the Cekirge Model Against GPT-Transformer Baselines. Am. J. Artif. Intell. 2025, 9(2), 272-280. doi: 10.11648/j.ajai.20250902.26
@article{10.11648/j.ajai.20250902.26,
author = {Huseyin Murat Cekirge},
title = {Deterministic σ-Regularized Benchmarking of the Cekirge Model Against GPT-Transformer Baselines},
journal = {American Journal of Artificial Intelligence},
volume = {9},
number = {2},
pages = {272-280},
doi = {10.11648/j.ajai.20250902.26},
url = {https://doi.org/10.11648/j.ajai.20250902.26},
eprint = {https://article.sciencepublishinggroup.com/pdf/10.11648.j.ajai.20250902.26},
year = {2025}
}
TY - JOUR
T1 - Deterministic σ-Regularized Benchmarking of the Cekirge Model Against GPT-Transformer Baselines
AU - Huseyin Murat Cekirge
Y1 - 2025/11/28
PY - 2025
N1 - https://doi.org/10.11648/j.ajai.20250902.26
DO - 10.11648/j.ajai.20250902.26
T2 - American Journal of Artificial Intelligence
JF - American Journal of Artificial Intelligence
JO - American Journal of Artificial Intelligence
SP - 272
EP - 280
VL - 9
IS - 2
PB - Science Publishing Group
SN - 2639-9733
UR - https://doi.org/10.11648/j.ajai.20250902.26
ER -