The Cekirge Method introduces a deterministic, algebraic paradigm for artificial intelligence that replaces stochastic gradient descent and related iterative schemes, such as plain and conjugate gradient descent, with a single closed-form computation. Rather than updating parameters through iterative optimization, the method computes the optimal mapping between contextual inputs and target outputs analytically. This closed-form formulation eliminates randomness, guarantees reproducibility across hardware platforms, and avoids the variability inherent in gradient-based training. σ-Regularization ensures that all matrices involved in the computation remain invertible and well-conditioned, allowing the system to operate reliably even when contextual structures exhibit high correlation or near-singularity. Benchmark comparisons with GPT-type transformer architectures show that the deterministic mapping achieves comparable accuracy while requiring far fewer computational steps. The absence of iterative training removes common issues associated with stochastic optimization, including sensitivity to initialization, unpredictable convergence paths, and gradient noise. Perturbation analysis further demonstrates stable behavior: small, uniformly applied modifications to the attention matrices produce smooth, monotonic variations in loss, with an effective stability coefficient of k ≈ 1.8. This indicates that the solution behaves predictably and remains well-conditioned under structured variations in input. The algebraic nature of the method also confers strong interpretability: every transformation, from the contextual matrices Q, K, and V to the final mapping W*, is explicit and invertible, enabling complete traceability of how each component of the input contributes to the output. The result is a transparent computational pipeline, in contrast to the opaque weight distributions that emerge from stochastic gradient descent. The formulation extends naturally to multi-head attention mechanisms and large-matrix architectures, offering a pathway to scalable deterministic transformers. By replacing probabilistic search with analytic resolution, the Cekirge Method establishes a mathematically grounded alternative to conventional learning. The framework provides deterministic convergence, structural clarity, and reproducible outcomes, laying the foundation for a new class of explainable and reliable artificial intelligence systems.
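As a rough illustration of the closed-form idea summarized above, the sketch below solves a σ-regularized least-squares mapping in one algebraic step and runs a small perturbation sweep. It is a minimal sketch under assumed definitions: the ridge-style formula W* = (KᵀK + σI)⁻¹KᵀV, the σ value, and the synthetic data are illustrative assumptions, not the paper's exact formulation.

```python
# Minimal sketch of a σ-regularized closed-form mapping (illustrative assumptions
# throughout: the ridge-style formula, the sigma value, and the synthetic data
# are not taken from the paper).
import numpy as np

rng = np.random.default_rng(0)  # fixed seed; the solve itself is deterministic

# Hypothetical contextual inputs K (n x d) and targets V (n x m)
n, d, m = 64, 16, 8
K = rng.normal(size=(n, d))
W_true = rng.normal(size=(d, m))
V = K @ W_true + 0.1 * rng.normal(size=(n, m))

def closed_form_mapping(K, V, sigma=1e-2):
    """Compute W* = (K^T K + sigma*I)^-1 K^T V in a single algebraic step.

    The sigma*I term keeps K^T K invertible and well-conditioned even when
    the columns of K are highly correlated or nearly singular.
    """
    return np.linalg.solve(K.T @ K + sigma * np.eye(K.shape[1]), K.T @ V)

def loss(K, V, W):
    """Mean squared error of the mapping K @ W against the targets V."""
    return float(np.mean((K @ W - V) ** 2))

W_star = closed_form_mapping(K, V)
print(f"baseline loss: {loss(K, V, W_star):.4f}")

# Crude stand-in for the perturbation analysis: scale the context uniformly by
# (1 + eps) and confirm the loss varies smoothly rather than erratically.
for eps in (0.01, 0.02, 0.05):
    print(f"eps={eps:.2f}  loss={loss((1.0 + eps) * K, V, W_star):.4f}")
```

In this toy setting the single solve plays the role of the training loop, which is what removes initialization sensitivity and gradient noise from the picture.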
| Published in | American Journal of Artificial Intelligence (Volume 9, Issue 2) |
| DOI | 10.11648/j.ajai.20250902.26 |
| Page(s) | 272-280 |
| Creative Commons | This is an Open Access article, distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution and reproduction in any medium or format, provided the original work is properly cited. |
| Copyright | Copyright © The Author(s), 2025. Published by Science Publishing Group |
| Keywords | Deterministic Learning, σ-Regularization, AI Energy-efficient Computation, Cekirge Method, GPT Benchmarking |
APA Style
Cekirge, H. M. (2025). Deterministic σ-Regularized Benchmarking of the Cekirge Model Against GPT-Transformer Baselines. American Journal of Artificial Intelligence, 9(2), 272-280. https://doi.org/10.11648/j.ajai.20250902.26
ACS Style
Cekirge, H. M. Deterministic σ-Regularized Benchmarking of the Cekirge Model Against GPT-Transformer Baselines. Am. J. Artif. Intell. 2025, 9(2), 272-280. doi: 10.11648/j.ajai.20250902.26
@article{10.11648/j.ajai.20250902.26,
author = {Huseyin Murat Cekirge},
title = {Deterministic σ-Regularized Benchmarking of the Cekirge Model Against GPT-Transformer Baselines},
journal = {American Journal of Artificial Intelligence},
volume = {9},
number = {2},
pages = {272-280},
doi = {10.11648/j.ajai.20250902.26},
url = {https://doi.org/10.11648/j.ajai.20250902.26},
eprint = {https://article.sciencepublishinggroup.com/pdf/10.11648.j.ajai.20250902.26},
year = {2025}
}
TY - JOUR
T1 - Deterministic σ-Regularized Benchmarking of the Cekirge Model Against GPT-Transformer Baselines
AU - Huseyin Murat Cekirge
Y1 - 2025/11/28
PY - 2025
N1 - https://doi.org/10.11648/j.ajai.20250902.26
DO - 10.11648/j.ajai.20250902.26
T2 - American Journal of Artificial Intelligence
JF - American Journal of Artificial Intelligence
JO - American Journal of Artificial Intelligence
SP - 272
EP - 280
VL - 9
IS - 2
PB - Science Publishing Group
SN - 2639-9733
UR - https://doi.org/10.11648/j.ajai.20250902.26
ER -